A Mapless Local Path Planning Approach Using Deep Reinforcement Learning Framework
Path planning and obstacle avoidance are key modules for autonomous mobile robots. Global path planning based on known maps has been effectively achieved, but local path planning in unknown dynamic environments remains very challenging due to the lack of detailed environmental information and to unpredictability.
Main Authors: Yan Yin, Zhiyu Chen, Gang Liu, Jianwei Guo
Format: Article
Language: English
Published: MDPI AG, 2023-02-01
Series: Sensors
Subjects: D3QN; exploration-exploitation; turtlebot3; n-step; auxiliary reward functions; path planning
Online Access: https://www.mdpi.com/1424-8220/23/4/2036
author | Yan Yin, Zhiyu Chen, Gang Liu, Jianwei Guo
collection | DOAJ |
description | Path planning and obstacle avoidance are key modules for autonomous mobile robots. Global path planning based on known maps has been effectively achieved, but local path planning in unknown dynamic environments remains very challenging due to the lack of detailed environmental information and to unpredictability. This paper proposes an end-to-end local path planner, n-step dueling double DQN with reward-based ϵ-greedy (RND3QN), built on a deep reinforcement learning framework. It takes LiDAR measurements as input and uses a neural network to fit Q-values, from which the corresponding discrete actions are output. Bias is reduced by applying n-step bootstrapping to the deep Q-network (DQN). The ϵ-greedy exploration-exploitation strategy is improved by using the reward value as a measure of exploration, and an auxiliary reward function is introduced to densify the reward distribution in sparse-reward environments. Simulation experiments in Gazebo test the algorithm's effectiveness. The experimental data demonstrate that the average total reward of RND3QN is higher than that of algorithms such as dueling double DQN (D3QN), with success rates increased by 174%, 65%, and 61% over D3QN on three stages, respectively. We also experimented on a TurtleBot3 Waffle Pi robot; the policies learned in simulation transfer effectively to the real robot.
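This record does not include the authors' implementation. As a minimal sketch of the two mechanisms the abstract names — an n-step bootstrapped target on top of (dueling) double DQN, and an ϵ-greedy schedule modulated by reward — the following PyTorch-style code may help; all function names, buffer conventions, and the ϵ-update constants are illustrative assumptions, not the paper's method.

```python
import torch

def n_step_target(rewards, gamma, q_online, q_target, next_state, done):
    """n-step bootstrapped double-DQN target (illustrative).

    rewards:    the n most recent rewards r_t ... r_{t+n-1}
    next_state: state s_{t+n} used for the bootstrap term
    """
    # Discounted sum of the n intermediate rewards.
    g = sum(gamma ** i * r for i, r in enumerate(rewards))
    if not done:
        with torch.no_grad():
            # Double DQN: the online net selects the greedy action,
            # the target net evaluates it.
            a_star = q_online(next_state).argmax(dim=-1, keepdim=True)
            bootstrap = q_target(next_state).gather(-1, a_star).squeeze(-1)
        g = g + gamma ** len(rewards) * bootstrap
    return g

def reward_based_epsilon(eps, avg_reward, prev_avg_reward,
                         eps_min=0.05, decay=0.995, boost=1.01):
    """Hypothetical reward-based epsilon schedule: decay exploration
    while the running average reward improves, and raise it again when
    reward stagnates. The constants and the update rule are assumptions,
    not the paper's definition."""
    if avg_reward > prev_avg_reward:
        return max(eps_min, eps * decay)  # exploit more as learning progresses
    return min(1.0, eps * boost)          # explore more when reward stalls
```

Using the reward itself to steer ϵ, rather than a fixed decay, is what distinguishes the abstract's exploration strategy from vanilla ϵ-greedy; the sketch above captures that idea in its simplest form.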
format | Article |
id | doaj.art-54afa341bfe443cdbeade45e0b7a9e5d |
institution | Directory Open Access Journal |
issn | 1424-8220 |
language | English |
publishDate | 2023-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Sensors |
doi | 10.3390/s23042036
citation | Sensors, vol. 23, no. 4, art. 2036 (2023-02-01), MDPI AG
affiliation | School of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China (all four authors)
title | A Mapless Local Path Planning Approach Using Deep Reinforcement Learning Framework |
topic | D3QN; exploration-exploitation; turtlebot3; n-step; auxiliary reward functions; path planning
url | https://www.mdpi.com/1424-8220/23/4/2036 |