Safe reinforcement learning for high-speed autonomous racing

The conventional application of deep reinforcement learning (DRL) to autonomous racing requires the agent to crash during training, thus limiting training to simulation environments. Further, many DRL approaches still exhibit high crash rates after training, making them infeasible for real-world use...

Bibliographic Details
Main Authors:	Benjamin D. Evans, Hendrik W. Jordaan, Herman A. Engelbrecht
Format:	Article
Language:	English
Published:	KeAi Communications Co. Ltd. 2023-01-01
Series:	Cognitive Robotics
Subjects:	Reinforcement learning; Safe learning; Autonomous racing; Safe autonomous systems
Online Access:	http://www.sciencedirect.com/science/article/pii/S2667241323000125

_version_	1797676538432323584
author	Benjamin D. Evans; Hendrik W. Jordaan; Herman A. Engelbrecht
author_sort	Benjamin D. Evans
collection	DOAJ
description	The conventional application of deep reinforcement learning (DRL) to autonomous racing requires the agent to crash during training, thus limiting training to simulation environments. Further, many DRL approaches still exhibit high crash rates after training, making them infeasible for real-world use. This paper addresses the problem of safely training DRL agents for autonomous racing. Firstly, we present a Viability Theory-based supervisor that ensures the vehicle does not crash and remains within the friction limit while maintaining recursive feasibility. Secondly, we use the supervisor to ensure the vehicle does not crash during the training of DRL agents for high-speed racing. The evaluation in the open-source F1Tenth simulator demonstrates that our safety system can ensure the safety of a worst-case scenario planner on four test maps up to speeds of 6 m/s. Training agents to race with the supervisor significantly improves sample efficiency, requiring only 10,000 steps. Our learning formulation leads to learning more conservative, safer policies with slower lap times and a higher success rate, resulting in our method being feasible for physical vehicle racing. Enabling DRL agents to learn to race without ever crashing is a step towards using DRL on physical vehicles.
first_indexed	2024-03-11T22:30:39Z
format	Article
id	doaj.art-2d471fbdba3b49a785fdb0a94e228b42
institution	Directory Open Access Journal
issn	2667-2413
language	English
last_indexed	2024-03-11T22:30:39Z
publishDate	2023-01-01
publisher	KeAi Communications Co. Ltd.
record_format	Article
series	Cognitive Robotics
spelling	doaj.art-2d471fbdba3b49a785fdb0a94e228b42; 2023-09-23T05:13:16Z; eng; KeAi Communications Co. Ltd.; Cognitive Robotics; 2667-2413; 2023-01-01; vol. 3; pp. 107-126; Safe reinforcement learning for high-speed autonomous racing; Benjamin D. Evans (corresponding author), Hendrik W. Jordaan, Herman A. Engelbrecht; Stellenbosch University, Electrical and Electronic Engineering, Banghoek Road, Stellenbosch, South Africa; http://www.sciencedirect.com/science/article/pii/S2667241323000125; Reinforcement learning; Safe learning; Autonomous racing; Safe autonomous systems
title	Safe reinforcement learning for high-speed autonomous racing
topic	Reinforcement learning; Safe learning; Autonomous racing; Safe autonomous systems
url	http://www.sciencedirect.com/science/article/pii/S2667241323000125

Similar Items