Safe reinforcement learning for high-speed autonomous racing

Full description

The conventional application of deep reinforcement learning (DRL) to autonomous racing requires the agent to crash during training, limiting training to simulation environments. Further, many DRL approaches still exhibit high crash rates after training, making them infeasible for real-world use. This paper addresses the problem of safely training DRL agents for autonomous racing. First, we present a Viability Theory-based supervisor that ensures the vehicle does not crash and remains within the friction limit while maintaining recursive feasibility. Second, we use the supervisor to ensure that the vehicle never crashes during the training of DRL agents for high-speed racing. Evaluation in the open-source F1Tenth simulator demonstrates that the safety system can ensure the safety of a worst-case-scenario planner on four test maps at speeds of up to 6 m/s. Training agents to race with the supervisor significantly improves sample efficiency, requiring only 10,000 steps. The learning formulation leads to more conservative, safer policies with slower lap times and a higher success rate, making the method feasible for racing on physical vehicles. Enabling DRL agents to learn to race without ever crashing is a step towards using DRL on physical vehicles.
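
The abstract describes a supervisor that intercepts the learning agent's actions and overrides any action that would lead to a crash, so training can proceed without the agent ever leaving the safe set. The sketch below illustrates that general pattern only; the class and helper names (ViabilitySupervisor, simulate_step, filter) are hypothetical, and the viability check is a toy placeholder rather than the paper's Viability Theory computation.

```python
# Minimal sketch of a supervisor-filtered action loop, loosely following the
# idea in the abstract. All names and the simplistic vehicle model here are
# illustrative assumptions, not the authors' implementation.

import numpy as np


class ViabilitySupervisor:
    """Overrides actions whose one-step-ahead state leaves the viable set."""

    def __init__(self, track_half_width=1.0, max_speed=6.0):
        self.track_half_width = track_half_width  # metres from centreline
        self.max_speed = max_speed                # m/s (abstract tests up to 6 m/s)

    def is_viable(self, state):
        # Placeholder viability check: stay on track and under the speed limit.
        lateral_offset, speed = state
        return abs(lateral_offset) <= self.track_half_width and speed <= self.max_speed

    def simulate_step(self, state, action):
        # Crude one-step rollout (steering nudges lateral offset, throttle
        # changes speed); a real supervisor would use a proper vehicle model.
        steer, throttle = action
        lateral_offset, speed = state
        return np.array([lateral_offset + 0.1 * steer * speed,
                         max(0.0, speed + 0.5 * throttle)])

    def filter(self, state, action):
        # Pass the agent's action through unless it exits the viable set;
        # otherwise substitute a conservative fallback (straighten and brake).
        if self.is_viable(self.simulate_step(state, action)):
            return action
        return np.array([0.0, -1.0])


# Usage: wrap the supervisor around any agent's action before env.step().
supervisor = ViabilitySupervisor()
state = np.array([0.2, 4.0])         # [lateral offset in m, speed in m/s]
proposed = np.array([0.8, 1.0])      # aggressive steer + full throttle
safe_action = supervisor.filter(state, proposed)
print(safe_action)
```

Because the override happens before the action reaches the environment, the agent still receives ordinary transitions and rewards, which is what allows training to continue crash-free.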

Bibliographic Details
Main Authors: Benjamin D. Evans, Hendrik W. Jordaan, Herman A. Engelbrecht (Stellenbosch University, Electrical and Electronic Engineering, Banghoek Road, Stellenbosch, South Africa)
Format: Article
Language: English
Published: KeAi Communications Co. Ltd., 2023-01-01
Series: Cognitive Robotics, Vol. 3 (2023), pp. 107–126
ISSN: 2667-2413
Subjects: Reinforcement learning; Safe learning; Autonomous racing; Safe autonomous systems
Online Access: http://www.sciencedirect.com/science/article/pii/S2667241323000125