Multi-UAV Autonomous Path Planning in Reconnaissance Missions Considering Incomplete Information: A Reinforcement Learning Method

Unmanned aerial vehicles (UAVs) are important in reconnaissance missions because of their flexibility and convenience. Vitally, UAVs are capable of autonomous navigation, which means they can be used to plan safe paths to target positions in dangerous surroundings. Traditional path-planning algorith...


Bibliographic Details
Main Authors:	Yu Chen, Qi Dong, Xiaozhou Shang, Zhenyu Wu, Jinyu Wang
Format:	Article
Language:	English
Published:	MDPI AG 2022-12-01
Series:	Drones
Subjects:	multi-UAV, path planning, incomplete information, multi-objective, reinforcement learning
Online Access:	https://www.mdpi.com/2504-446X/7/1/10

_version_	1827626701153107968
author	Yu Chen, Qi Dong, Xiaozhou Shang, Zhenyu Wu, Jinyu Wang
collection	DOAJ
description	Unmanned aerial vehicles (UAVs) are important in reconnaissance missions because of their flexibility and convenience. Vitally, UAVs are capable of autonomous navigation, which means they can be used to plan safe paths to target positions in dangerous surroundings. Traditional path-planning algorithms do not perform well when the environmental state is dynamic and partially observable. It is difficult for a UAV to make the correct decision with incomplete information. In this study, we proposed a multi-UAV path planning algorithm based on multi-agent reinforcement learning, which adopts a centralized training–decentralized execution architecture to coordinate all the UAVs. Additionally, we introduced a hidden state of the recurrent neural network to utilize the historical observation information. To solve the multi-objective optimization problem, we designed a joint reward function to guide UAVs to learn optimal policies under multiple constraints. The results demonstrate that by using our method, we were able to solve the problem of incomplete information and low efficiency caused by partial observations and sparse rewards in reinforcement learning, and we realized multi-UAV cooperative autonomous path planning in an unknown environment.
first_indexed	2024-03-09T13:01:17Z
format	Article
id	doaj.art-e8b74384f6e64b709ddf3ff1d94ff758
institution	Directory Open Access Journal
issn	2504-446X
language	English
last_indexed	2024-03-09T13:01:17Z
publishDate	2022-12-01
publisher	MDPI AG
record_format	Article
series	Drones
spelling	doaj.art-e8b74384f6e64b709ddf3ff1d94ff758; 2023-11-30T21:55:02Z; eng; MDPI AG; Drones; 2504-446X; 2022-12-01; 7; 1; 10; 10.3390/drones7010010; Multi-UAV Autonomous Path Planning in Reconnaissance Missions Considering Incomplete Information: A Reinforcement Learning Method; Yu Chen (0), Qi Dong (1), Xiaozhou Shang (2), Zhenyu Wu (3), Jinyu Wang (4); Institute of Advanced Technology, University of Science and Technology of China, Hefei 230026, China; Institute of Advanced Technology, University of Science and Technology of China, Hefei 230026, China; China Academy of Electronics and Information Technology, Beijing 100049, China; School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China; School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China; Unmanned aerial vehicles (UAVs) are important in reconnaissance missions because of their flexibility and convenience. Vitally, UAVs are capable of autonomous navigation, which means they can be used to plan safe paths to target positions in dangerous surroundings. Traditional path-planning algorithms do not perform well when the environmental state is dynamic and partially observable. It is difficult for a UAV to make the correct decision with incomplete information. In this study, we proposed a multi-UAV path planning algorithm based on multi-agent reinforcement learning, which adopts a centralized training–decentralized execution architecture to coordinate all the UAVs. Additionally, we introduced a hidden state of the recurrent neural network to utilize the historical observation information. To solve the multi-objective optimization problem, we designed a joint reward function to guide UAVs to learn optimal policies under multiple constraints. The results demonstrate that by using our method, we were able to solve the problem of incomplete information and low efficiency caused by partial observations and sparse rewards in reinforcement learning, and we realized multi-UAV cooperative autonomous path planning in an unknown environment.; https://www.mdpi.com/2504-446X/7/1/10; multi-UAV, path planning, incomplete information, multi-objective, reinforcement learning
title	Multi-UAV Autonomous Path Planning in Reconnaissance Missions Considering Incomplete Information: A Reinforcement Learning Method
topic	multi-UAV, path planning, incomplete information, multi-objective, reinforcement learning
url	https://www.mdpi.com/2504-446X/7/1/10


Similar Items