Predator-Prey Reward Based Q-Learning Coverage Path Planning for Mobile Robot

Bibliographic Details
Main Authors: Meiyan Zhang, Wenyu Cai, Lingfeng Pang
Format: Article
Language: English
Published: IEEE, 2023-01-01
Series: IEEE Access
Subjects: Coverage path planning; predator-prey model; reinforcement learning; Q-learning algorithm; mobile robot
Online Access: https://ieeexplore.ieee.org/document/10064303/
author Meiyan Zhang
Wenyu Cai
Lingfeng Pang
collection DOAJ
description Coverage Path Planning (CPP) is a fundamental problem for mobile robots across a variety of applications. Q-Learning based coverage path planning algorithms have recently begun to be explored. To overcome traditional Q-Learning's tendency to fall into local optima, this paper introduces new reward functions derived from the Predator-Prey model into the traditional Q-Learning based CPP solution: a comprehensive reward function that combines three components, the Predation Avoidance Reward Function, the Smoothness Reward Function, and the Boundary Reward Function. In addition, the influence of the weighting parameters on the total reward function is discussed. Extensive simulation results and practical experiments verify that the proposed Predator-Prey reward based Q-Learning Coverage Path Planning (PP-Q-Learning based CPP) outperforms traditional BCD and Q-Learning based CPP in terms of repetition ratio and number of turns.
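
A minimal sketch of the idea the abstract describes, assuming details this record does not give: a tabular Q-Learning update whose reward is a weighted sum of the three named terms (Predation Avoidance, Smoothness, Boundary). The reward shapes, the weights W1-W3, the grid size, and all function names below are hypothetical illustrations, not the authors' implementation.

    # Hypothetical sketch: Q-Learning coverage with a composite
    # Predator-Prey style reward R = W1*r_pred + W2*r_smooth + W3*r_bound.
    import numpy as np

    rng = np.random.default_rng(0)
    GRID = 10                                     # assumed 10x10 workspace
    ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    W1, W2, W3 = 0.5, 0.3, 0.2                    # assumed weighting parameters

    def composite_reward(cell, prev_a, a, visited):
        # Predation Avoidance (assumed shape): treat already-covered cells
        # as territory to avoid, penalizing repeated coverage.
        r_pred = -1.0 if visited[cell] else 1.0
        # Smoothness (assumed shape): favor keeping the same heading,
        # which reduces the number of turns in the coverage path.
        r_smooth = 1.0 if a == prev_a else -0.5
        # Boundary (assumed shape): small bonus for hugging the boundary.
        on_edge = cell[0] in (0, GRID - 1) or cell[1] in (0, GRID - 1)
        r_bound = 0.5 if on_edge else 0.0
        return W1 * r_pred + W2 * r_smooth + W3 * r_bound

    def q_learning_cpp(episodes=200, alpha=0.1, gamma=0.9, eps=0.1):
        q = np.zeros((GRID, GRID, len(ACTIONS)))  # Q(row, col, action)
        for _ in range(episodes):
            visited = np.zeros((GRID, GRID), dtype=bool)
            cell, prev_a = (0, 0), 0
            visited[cell] = True
            for _ in range(4 * GRID * GRID):      # step budget per episode
                # Epsilon-greedy action selection.
                a = (int(rng.integers(len(ACTIONS)))
                     if rng.random() < eps else int(np.argmax(q[cell])))
                dr, dc = ACTIONS[a]
                nxt = (min(max(cell[0] + dr, 0), GRID - 1),
                       min(max(cell[1] + dc, 0), GRID - 1))
                r = composite_reward(nxt, prev_a, a, visited)
                # Standard Q-Learning temporal-difference update.
                q[cell][a] += alpha * (r + gamma * q[nxt].max() - q[cell][a])
                cell, prev_a = nxt, a
                visited[cell] = True
                if visited.all():                 # full coverage reached
                    break
        return q

The weighted sum makes the trade-off explicit: raising W2 relative to W1 trades fewer turns against more revisited cells, which is presumably the kind of effect the paper's discussion of the weighting parameters examines.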
format Article
id doaj.art-26a76fd3eafb4d17985684a977c1aa76
institution Directory Open Access Journal
issn 2169-3536
language English
publishDate 2023-01-01
publisher IEEE
record_format Article
series IEEE Access
doi 10.1109/ACCESS.2023.3255007
volume 11
pages 29673-29683
author_orcid Meiyan Zhang: https://orcid.org/0000-0002-0396-5786
Wenyu Cai: https://orcid.org/0000-0002-8858-9221
affiliation Meiyan Zhang: College of Electrical Engineering, Zhejiang University of Water Resources and Electric Power, Hangzhou, China
Wenyu Cai: College of Electronics and Information, Hangzhou Dianzi University, Hangzhou, China
Lingfeng Pang: College of Electronics and Information, Hangzhou Dianzi University, Hangzhou, China
title Predator-Prey Reward Based Q-Learning Coverage Path Planning for Mobile Robot
topic Coverage path planning
predator-prey model
reinforcement learning
Q-learning algorithm
mobile robot
url https://ieeexplore.ieee.org/document/10064303/