Influence of the Reward Function on the Selection of Reinforcement Learning Agents for Hybrid Electric Vehicles Real-Time Control

The real-time control optimization of electrified vehicles is one of the most demanding tasks to be faced in the innovation progress of low-emissions mobility. Intelligent energy management systems represent interesting solutions to complex control problems, such as the maximization of the fuel economy of hybrid electric vehicles. In recent years, controllers based on reinforcement learning (RL) have been shown to outperform well-established real-time strategies for specific applications. Nevertheless, the effects produced by variations in the reward function have not been thoroughly analyzed, and the potential of adopting a given RL agent under different testing conditions is still to be assessed. In the present paper, the performance of different agents, i.e., Q-learning, deep Q-Network (DQN) and double deep Q-Network (DDQN), is investigated for a full hybrid electric vehicle throughout multiple driving missions and with two distinct reward functions. The first function aims at guaranteeing a charge-sustaining policy whilst reducing the fuel consumption (FC) as much as possible; the second aims at minimizing the fuel consumption whilst ensuring an acceptable battery state of charge (SOC) by the end of the mission. The novelty of the results lies in the demonstration of a non-trivial inability of DQN and DDQN to outperform traditional Q-learning when a SOC-oriented reward is considered. On the contrary, optimal fuel consumption reductions are attained by DQN and DDQN when the more complex FC-oriented minimization is deployed. This outcome is particularly evident when the RL agents are trained on regulatory driving cycles and tested on unknown real-world driving missions.

Bibliographic Details
Main Authors: Matteo Acquarone, Claudio Maino, Daniela Misul, Ezio Spessa, Antonio Mastropietro, Luca Sorrentino, Enrico Busto
Format: Article
Language: English
Published: MDPI AG 2023-03-01
Series: Energies
Subjects: artificial intelligence, fuel consumption, hybrid electric vehicles, real-time control, reinforcement learning
Online Access: https://www.mdpi.com/1996-1073/16/6/2749
DOI: 10.3390/en16062749
ISSN: 1996-1073
Author affiliations:
Matteo Acquarone, Claudio Maino, Daniela Misul, Ezio Spessa: Interdepartmental Center for Automotive Research and Sustainable Mobility (CARS@PoliTO), Department of Energetics, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129 Turin, Italy
Antonio Mastropietro: Department of Data Science, EURECOM, Route des Chappes 450, 06904 Biot, France
Luca Sorrentino, Enrico Busto: Addfor Industriale s.r.l., Piazza Solferino 7, 10121 Turin, Italy
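
The two reward formulations summarized in the abstract can be illustrated with a minimal, hypothetical Python sketch. The paper's exact equations, weights, and thresholds are not reproduced here; the names soc_target, soc_min, alpha, beta, and terminal_penalty are illustrative assumptions, not the authors' parameters.

def soc_oriented_reward(fuel_rate_gps, soc, soc_target=0.6, alpha=1.0, beta=10.0):
    # Charge-sustaining first: heavily penalize deviation from the SOC target
    # at every step, with a smaller penalty on instantaneous fuel consumption.
    return -(beta * abs(soc - soc_target) + alpha * fuel_rate_gps)

def fc_oriented_reward(fuel_rate_gps, soc, done, soc_min=0.55, alpha=1.0, terminal_penalty=100.0):
    # Fuel-economy first: penalize instantaneous fuel consumption at every step,
    # and apply a terminal penalty only if the end-of-mission SOC is below an
    # acceptable threshold.
    reward = -alpha * fuel_rate_gps
    if done and soc < soc_min:
        reward -= terminal_penalty * (soc_min - soc)
    return reward

# Example step: 1.2 g/s of fuel, SOC at 0.58, mid-mission
print(soc_oriented_reward(1.2, 0.58))             # approx. -1.4
print(fc_oriented_reward(1.2, 0.58, done=False))  # approx. -1.2

The sketch reflects the qualitative difference the abstract describes: the first reward shapes behaviour toward SOC tracking at every step, while the second only constrains SOC at the end of the mission and otherwise rewards fuel savings.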