Predator-Prey Reward Based Q-Learning Coverage Path Planning for Mobile Robot

Bibliographic Details
Main Authors: Meiyan Zhang, Wenyu Cai, Lingfeng Pang
Format: Article
Language: English
Published: IEEE, 2023-01-01
Series: IEEE Access
Subjects: Coverage path planning; predator-prey model; reinforcement learning; Q-learning algorithm; mobile robot
Online Access: https://ieeexplore.ieee.org/document/10064303/
author Meiyan Zhang
Wenyu Cai
Lingfeng Pang
collection DOAJ
description Coverage Path Planning (CPP) is a fundamental problem for mobile robots across a variety of applications. Q-Learning based coverage path planning algorithms have recently begun to be explored. To overcome traditional Q-Learning's tendency to fall into local optima, this paper introduces new reward functions derived from the Predator-Prey model into the traditional Q-Learning based CPP solution: a comprehensive reward function that combines three components, the Predation Avoidance Reward Function, the Smoothness Reward Function, and the Boundary Reward Function. In addition, the influence of the weighting parameters on the total reward function is discussed. Extensive simulation results and practical experiments verify that the proposed Predator-Prey reward based Q-Learning Coverage Path Planning (PP-Q-Learning based CPP) outperforms traditional BCD and Q-Learning based CPP in terms of repetition ratio and number of turns.
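
A minimal sketch of the idea the abstract describes, assuming details this record does not give: a tabular Q-Learning update whose reward is a weighted sum of the three named terms (Predation Avoidance, Smoothness, Boundary). The reward shapes, the weights W1-W3, the grid size, and all function names below are hypothetical illustrations, not the authors' implementation.

    # Hypothetical sketch: Q-Learning coverage with a composite
    # Predator-Prey style reward R = W1*r_pred + W2*r_smooth + W3*r_bound.
    import numpy as np

    rng = np.random.default_rng(0)
    GRID = 10                                     # assumed 10x10 workspace
    ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    W1, W2, W3 = 0.5, 0.3, 0.2                    # assumed weighting parameters

    def composite_reward(cell, prev_a, a, visited):
        # Predation Avoidance (assumed shape): treat already-covered cells
        # as territory to avoid, penalizing repeated coverage.
        r_pred = -1.0 if visited[cell] else 1.0
        # Smoothness (assumed shape): favor keeping the same heading,
        # which reduces the number of turns in the coverage path.
        r_smooth = 1.0 if a == prev_a else -0.5
        # Boundary (assumed shape): small bonus for hugging the boundary.
        on_edge = cell[0] in (0, GRID - 1) or cell[1] in (0, GRID - 1)
        r_bound = 0.5 if on_edge else 0.0
        return W1 * r_pred + W2 * r_smooth + W3 * r_bound

    def q_learning_cpp(episodes=200, alpha=0.1, gamma=0.9, eps=0.1):
        q = np.zeros((GRID, GRID, len(ACTIONS)))  # Q(row, col, action)
        for _ in range(episodes):
            visited = np.zeros((GRID, GRID), dtype=bool)
            cell, prev_a = (0, 0), 0
            visited[cell] = True
            for _ in range(4 * GRID * GRID):      # step budget per episode
                # Epsilon-greedy action selection.
                a = (int(rng.integers(len(ACTIONS)))
                     if rng.random() < eps else int(np.argmax(q[cell])))
                dr, dc = ACTIONS[a]
                nxt = (min(max(cell[0] + dr, 0), GRID - 1),
                       min(max(cell[1] + dc, 0), GRID - 1))
                r = composite_reward(nxt, prev_a, a, visited)
                # Standard Q-Learning temporal-difference update.
                q[cell][a] += alpha * (r + gamma * q[nxt].max() - q[cell][a])
                cell, prev_a = nxt, a
                visited[cell] = True
                if visited.all():                 # full coverage reached
                    break
        return q

The weighted sum makes the trade-off explicit: raising W2 relative to W1 trades fewer turns against more revisited cells, which is presumably the kind of effect the paper's discussion of the weighting parameters examines.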
format Article
id doaj.art-26a76fd3eafb4d17985684a977c1aa76
institution Directory Open Access Journal
issn 2169-3536
language English
publishDate 2023-01-01
publisher IEEE
record_format Article
series IEEE Access
doi 10.1109/ACCESS.2023.3255007
volume 11
pages 29673-29683
author_orcid Meiyan Zhang: https://orcid.org/0000-0002-0396-5786
Wenyu Cai: https://orcid.org/0000-0002-8858-9221
affiliation Meiyan Zhang: College of Electrical Engineering, Zhejiang University of Water Resources and Electric Power, Hangzhou, China
Wenyu Cai: College of Electronics and Information, Hangzhou Dianzi University, Hangzhou, China
Lingfeng Pang: College of Electronics and Information, Hangzhou Dianzi University, Hangzhou, China
title Predator-Prey Reward Based Q-Learning Coverage Path Planning for Mobile Robot
topic Coverage path planning
predator-prey model
reinforcement learning
Q-learning algorithm
mobile robot
url https://ieeexplore.ieee.org/document/10064303/