Safe reinforcement learning for high-speed autonomous racing
The conventional application of deep reinforcement learning (DRL) to autonomous racing requires the agent to crash during training, thus limiting training to simulation environments. Further, many DRL approaches still exhibit high crash rates after training, making them infeasible for real-world use...
Main Authors: | Benjamin D. Evans, Hendrik W. Jordaan, Herman A. Engelbrecht |
---|---|
Format: | Article |
Language: | English |
Published: | KeAi Communications Co. Ltd. 2023-01-01 |
Series: | Cognitive Robotics |
Subjects: | Reinforcement learning; Safe learning; Autonomous racing; Safe autonomous systems |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2667241323000125 |
_version_ | 1797676538432323584 |
---|---|
author | Benjamin D. Evans Hendrik W. Jordaan Herman A. Engelbrecht |
author_facet | Benjamin D. Evans Hendrik W. Jordaan Herman A. Engelbrecht |
author_sort | Benjamin D. Evans |
collection | DOAJ |
description | The conventional application of deep reinforcement learning (DRL) to autonomous racing requires the agent to crash during training, thus limiting training to simulation environments. Further, many DRL approaches still exhibit high crash rates after training, making them infeasible for real-world use. This paper addresses the problem of safely training DRL agents for autonomous racing. Firstly, we present a Viability Theory-based supervisor that ensures the vehicle does not crash and remains within the friction limit while maintaining recursive feasibility. Secondly, we use the supervisor to ensure the vehicle does not crash during the training of DRL agents for high-speed racing. The evaluation in the open-source F1Tenth simulator demonstrates that our safety system can ensure the safety of a worst-case scenario planner on four test maps up to speeds of 6 m/s. Training agents to race with the supervisor significantly improves sample efficiency, requiring only 10,000 steps. Our learning formulation leads to learning more conservative, safer policies with slower lap times and a higher success rate, resulting in our method being feasible for physical vehicle racing. Enabling DRL agents to learn to race without ever crashing is a step towards using DRL on physical vehicles. |
first_indexed | 2024-03-11T22:30:39Z |
format | Article |
id | doaj.art-2d471fbdba3b49a785fdb0a94e228b42 |
institution | Directory Open Access Journal |
issn | 2667-2413 |
language | English |
last_indexed | 2024-03-11T22:30:39Z |
publishDate | 2023-01-01 |
publisher | KeAi Communications Co. Ltd. |
record_format | Article |
series | Cognitive Robotics |
spelling | doaj.art-2d471fbdba3b49a785fdb0a94e228b42 2023-09-23T05:13:16Z eng KeAi Communications Co. Ltd. Cognitive Robotics 2667-2413 2023-01-01 3 107 126 Safe reinforcement learning for high-speed autonomous racing Benjamin D. Evans 0 Hendrik W. Jordaan 1 Herman A. Engelbrecht 2 Corresponding author.; Stellenbosch University, Electrical and Electronic Engineering, Banghoek Road, Stellenbosch, South Africa Stellenbosch University, Electrical and Electronic Engineering, Banghoek Road, Stellenbosch, South Africa Stellenbosch University, Electrical and Electronic Engineering, Banghoek Road, Stellenbosch, South Africa The conventional application of deep reinforcement learning (DRL) to autonomous racing requires the agent to crash during training, thus limiting training to simulation environments. Further, many DRL approaches still exhibit high crash rates after training, making them infeasible for real-world use. This paper addresses the problem of safely training DRL agents for autonomous racing. Firstly, we present a Viability Theory-based supervisor that ensures the vehicle does not crash and remains within the friction limit while maintaining recursive feasibility. Secondly, we use the supervisor to ensure the vehicle does not crash during the training of DRL agents for high-speed racing. The evaluation in the open-source F1Tenth simulator demonstrates that our safety system can ensure the safety of a worst-case scenario planner on four test maps up to speeds of 6 m/s. Training agents to race with the supervisor significantly improves sample efficiency, requiring only 10,000 steps. Our learning formulation leads to learning more conservative, safer policies with slower lap times and a higher success rate, resulting in our method being feasible for physical vehicle racing. Enabling DRL agents to learn to race without ever crashing is a step towards using DRL on physical vehicles. http://www.sciencedirect.com/science/article/pii/S2667241323000125 Reinforcement learning Safe learning Autonomous racing Safe autonomous systems |
spellingShingle | Benjamin D. Evans Hendrik W. Jordaan Herman A. Engelbrecht Safe reinforcement learning for high-speed autonomous racing Cognitive Robotics Reinforcement learning Safe learning Autonomous racing Safe autonomous systems |
title | Safe reinforcement learning for high-speed autonomous racing |
title_full | Safe reinforcement learning for high-speed autonomous racing |
title_fullStr | Safe reinforcement learning for high-speed autonomous racing |
title_full_unstemmed | Safe reinforcement learning for high-speed autonomous racing |
title_short | Safe reinforcement learning for high-speed autonomous racing |
title_sort | safe reinforcement learning for high speed autonomous racing |
topic | Reinforcement learning; Safe learning; Autonomous racing; Safe autonomous systems |
url | http://www.sciencedirect.com/science/article/pii/S2667241323000125 |
work_keys_str_mv | AT benjamindevans safereinforcementlearningforhighspeedautonomousracing AT hendrikwjordaan safereinforcementlearningforhighspeedautonomousracing AT hermanaengelbrecht safereinforcementlearningforhighspeedautonomousracing |