A Mapless Local Path Planning Approach Using Deep Reinforcement Learning Framework

Bibliographic Details
Main Authors: Yan Yin, Zhiyu Chen, Gang Liu, Jianwei Guo
Affiliation: School of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China
Format: Article
Language: English
Published: MDPI AG, 2023-02-01
Series: Sensors, Vol. 23, Issue 4, Article 2036
ISSN: 1424-8220
DOI: 10.3390/s23042036
Subjects: D3QN; exploration-exploitation; turtlebot3; n-step; auxiliary reward functions; path planning
Online Access: https://www.mdpi.com/1424-8220/23/4/2036

Abstract: A key module for autonomous mobile robots is path planning and obstacle avoidance. Global path planning based on known maps has been achieved effectively, but local path planning in unknown, dynamic environments remains challenging because detailed environmental information is lacking and the environment is unpredictable. This paper proposes an end-to-end local path planner, n-step dueling double DQN with reward-based ϵ-greedy (RND3QN), based on a deep reinforcement learning framework; it takes LiDAR measurements as input and uses a neural network to fit Q-values that map to discrete actions. Bias is reduced with n-step bootstrapping built on the deep Q-network (DQN). The ϵ-greedy exploration-exploitation strategy is improved by using the reward value as a measure of exploration, and an auxiliary reward function is introduced to densify the reward distribution of the otherwise sparse-reward environment. Simulation experiments in Gazebo test the algorithm's effectiveness: the average total reward of RND3QN is higher than that of algorithms such as dueling double DQN (D3QN), and success rates increase by 174%, 65%, and 61% over D3QN across three stages, respectively. Experiments on a TurtleBot3 Waffle Pi robot show that strategies learned in simulation transfer effectively to the real robot.
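
For orientation, the n-step double-DQN target the abstract alludes to can be written in a standard form; this is a sketch consistent with the description above, not necessarily the paper's exact formulation. Here θ and θ⁻ denote the online and target network parameters and γ the discount factor:

G_t^{(n)} = \sum_{k=0}^{n-1} \gamma^k r_{t+k+1} + \gamma^n Q\left(s_{t+n}, \arg\max_a Q(s_{t+n}, a; \theta); \theta^-\right)

The online network selects the greedy action at s_{t+n} and the target network evaluates it (the "double" decoupling), while summing n observed rewards before bootstrapping is the n-step component the abstract credits with reducing bias.

The record does not spell out the reward-based ϵ-greedy rule, so the Python sketch below is one plausible reading under stated assumptions: ϵ decays from eps_max toward eps_min as the mean of recent episode rewards approaches a target level, so exploration tapers off as performance improves. All names (RewardBasedEpsilonGreedy, reward_target, window) are hypothetical, not taken from the paper.

import random
from collections import deque

class RewardBasedEpsilonGreedy:
    """One plausible reward-based epsilon-greedy schedule (assumption, not the paper's rule)."""

    def __init__(self, eps_max=1.0, eps_min=0.05, window=50, reward_target=200.0):
        self.eps_max = eps_max
        self.eps_min = eps_min
        self.reward_target = reward_target   # reward level at which epsilon bottoms out (assumed)
        self.recent = deque(maxlen=window)   # sliding window of recent episode rewards

    def record_episode(self, total_reward):
        self.recent.append(total_reward)

    def epsilon(self):
        if not self.recent:
            return self.eps_max              # fully exploratory before any feedback
        mean_reward = sum(self.recent) / len(self.recent)
        progress = min(max(mean_reward / self.reward_target, 0.0), 1.0)
        # Higher recent reward -> lower epsilon -> less exploration.
        return self.eps_max - (self.eps_max - self.eps_min) * progress

    def select_action(self, q_values):
        if random.random() < self.epsilon():
            return random.randrange(len(q_values))                   # explore
        return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit greedily

# Usage: record each episode's total reward; epsilon then adapts automatically.
explorer = RewardBasedEpsilonGreedy()
action = explorer.select_action([0.1, 0.7, 0.3])
explorer.record_episode(35.0)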