Safe reinforcement learning for multi-energy management systems with known constraint functions
Main Authors: | Glenn Ceusters, Luis Ramirez Camargo, Rüdiger Franke, Ann Nowé, Maarten Messagie |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2023-04-01 |
Series: | Energy and AI |
Subjects: | Reinforcement learning; Constraints; Multi-energy systems; Energy management system |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2666546822000738 |
_version_ | 1797833420425920512 |
---|---|
author | Glenn Ceusters; Luis Ramirez Camargo; Rüdiger Franke; Ann Nowé; Maarten Messagie |
collection | DOAJ |
description | Reinforcement learning (RL) is a promising optimal control technique for multi-energy management systems. It does not require a model a priori, which reduces the upfront and ongoing project-specific engineering effort, and it is capable of learning better representations of the underlying system dynamics. However, vanilla RL does not provide constraint satisfaction guarantees, which can result in various potentially unsafe interactions with its environment. In this paper, we present two novel online model-free safe RL methods, namely SafeFallback and GiveSafe, in which the safety constraint formulation is decoupled from the RL formulation. Both provide hard-constraint satisfaction guarantees during training as well as during deployment of the (near) optimal policy. They do so without solving a mathematical program, which lowers the computational requirements and allows more flexible constraint function formulations. In a simulated multi-energy systems case study, we show that both methods start with a significantly higher utility than a vanilla RL benchmark and an Optlayer benchmark (94.6% and 82.8% compared to 35.5% and 77.8%) and that the proposed SafeFallback method can even outperform the vanilla RL benchmark (102.9% to 100%). We conclude that both methods are viable safety constraint handling techniques that are applicable beyond RL, as demonstrated with random policies, while still providing hard-constraint guarantees. |
first_indexed | 2024-04-09T14:23:55Z |
format | Article |
id | doaj.art-68aca0d2b26848d1b84a4a728e7440e0 |
institution | Directory of Open Access Journals |
issn | 2666-5468 |
language | English |
last_indexed | 2024-04-09T14:23:55Z |
publishDate | 2023-04-01 |
publisher | Elsevier |
record_format | Article |
series | Energy and AI |
spelling | doaj.art-68aca0d2b26848d1b84a4a728e7440e0; 2023-05-04T10:44:11Z; eng; Elsevier; Energy and AI; 2666-5468; 2023-04-01; 12; 100227; Safe reinforcement learning for multi-energy management systems with known constraint functions; Glenn Ceusters [0]; Luis Ramirez Camargo [1]; Rüdiger Franke [2]; Ann Nowé [3]; Maarten Messagie [4]; [0] ABB, Hoge Wei 27, 1930 Zaventem, Belgium; Vrije Universiteit Brussel (VUB), ETEC-MOBI, Pleinlaan 2, 1050 Brussels, Belgium; Vrije Universiteit Brussel (VUB), AI-lab, Pleinlaan 2, 1050 Brussels, Belgium; Corresponding author at: Vrije Universiteit Brussel (VUB), ETEC-MOBI, Pleinlaan 2, 1050 Brussels, Belgium; [1] Vrije Universiteit Brussel (VUB), ETEC-MOBI, Pleinlaan 2, 1050 Brussels, Belgium; Copernicus Institute of Sustainable Development - Utrecht University, Princetonlaan 8a, 3584, CB Utrecht, Netherlands; [2] ABB, Hoge Wei 27, 1930 Zaventem, Belgium; [3] Vrije Universiteit Brussel (VUB), AI-lab, Pleinlaan 2, 1050 Brussels, Belgium; [4] Vrije Universiteit Brussel (VUB), ETEC-MOBI, Pleinlaan 2, 1050 Brussels, Belgium; http://www.sciencedirect.com/science/article/pii/S2666546822000738; Reinforcement learning; Constraints; Multi-energy systems; Energy management system |
title | Safe reinforcement learning for multi-energy management systems with known constraint functions |
topic | Reinforcement learning; Constraints; Multi-energy systems; Energy management system |
url | http://www.sciencedirect.com/science/article/pii/S2666546822000738 |