Cautious reinforcement learning with logical constraints

This paper presents the concept of an adaptive safe padding that forces Reinforcement Learning (RL) to synthesise optimal control policies while ensuring safety during the learning process. Policies are synthesised to satisfy a goal, expressed as a temporal logic formula, with maximal probability. E...

Bibliographic Details
Main Authors: Hasanbeig, M; Abate, A; Kroening, D
Format: Conference item
Language: English
Published: International Foundation for Autonomous Agents and Multiagent Systems, 2020