Stable and Efficient Reinforcement Learning Method for Avoidance Driving of Unmanned Vehicles

Reinforcement learning (RL) has demonstrated considerable potential in solving challenges across various domains, notably in autonomous driving. Nevertheless, implementing RL in autonomous driving comes with its own set of difficulties, such as the overestimation phenomenon, extensive learning time,...

Full description

Bibliographic Details
Main Authors: Sun-Ho Jang, Woo-Jin Ahn, Yu-Jin Kim, Hyung-Gil Hong, Dong-Sung Pae, Myo-Taeg Lim
Format: Article
Language:English
Published: MDPI AG 2023-09-01
Series:Electronics
Subjects:
Online Access:https://www.mdpi.com/2079-9292/12/18/3773
Description
Summary:Reinforcement learning (RL) has demonstrated considerable potential in solving challenges across various domains, notably in autonomous driving. Nevertheless, implementing RL in autonomous driving comes with its own set of difficulties, such as the overestimation phenomenon, extensive learning time, and sparse reward problems. Although solutions like hindsight experience replay (HER) have been proposed to alleviate these issues, the direct utilization of RL in autonomous vehicles remains constrained due to the intricate fusion of information and the possibility of system failures during the learning process. In this paper, we present a novel RL-based autonomous driving system technology that combines obstacle-dependent Gaussian (ODG) RL, soft actor-critic (SAC), and meta-learning algorithms. Our approach addresses key issues in RL, including the overestimation phenomenon and sparse reward problems, by incorporating prior knowledge derived from the ODG algorithm. With these solutions in place, the ultimate aim of this work is to improve the performance of reinforcement learning and develop a swift, stable, and robust learning method for implementing autonomous driving systems that can effectively adapt to various environments and overcome the constraints of direct RL utilization in autonomous vehicles. We evaluated our proposed algorithm on official F1 circuits, using high-fidelity racing simulations with complex dynamics. The results demonstrate exceptional performance, with our method achieving up to 89% faster learning speed compared to existing algorithms in these environments.
ISSN:2079-9292