Memory-Enhanced Twin Delayed Deep Deterministic Policy Gradient (ME-TD3)-Based Unmanned Combat Aerial Vehicle Trajectory Planning for Avoiding Radar Detection Threats in Dynamic and Unknown Environments

Unmanned combat aerial vehicle (UCAV) trajectory planning to avoid radar detection threats is a complicated optimization problem that has been widely studied. The rapid changes in Radar Cross Sections (RCSs), the unknown cruise trajectory of airborne radar, and the uncertain distribution of radars e...


Bibliographic Details
Main Authors:	Jiantao Li, Tianxian Zhang, Kai Liu
Format:	Article
Language:	English
Published:	MDPI AG 2023-11-01
Series:	Remote Sensing
Subjects:	UCAV, trajectory planning, predictive control model, memory-enhanced twin delayed deep deterministic policy gradient
Online Access:	https://www.mdpi.com/2072-4292/15/23/5494

_version_	1797399671854858240
author	Jiantao Li, Tianxian Zhang, Kai Liu
author_facet	Jiantao Li, Tianxian Zhang, Kai Liu
author_sort	Jiantao Li
collection	DOAJ
description	Unmanned combat aerial vehicle (UCAV) trajectory planning to avoid radar detection threats is a complicated optimization problem that has been widely studied. The rapid changes in Radar Cross Sections (RCSs), the unknown cruise trajectory of airborne radar, and the uncertain distribution of radars exacerbate the complexity of this problem. In this paper, we propose a novel UCAV trajectory planning method based on deep reinforcement learning (DRL) technology to overcome the adverse impacts caused by the dynamics and randomness of environments. A predictive control model is constructed to describe the dynamic characteristics of the UCAV trajectory planning problem in detail. To improve the UCAV’s predictive ability, we propose a memory-enhanced twin delayed deep deterministic policy gradient (ME-TD3) algorithm that uses an attention mechanism to effectively extract environmental patterns from historical information. The simulation results show that the proposed method can successfully train UCAVs to carry out trajectory planning tasks in dynamic and unknown environments. Furthermore, the ME-TD3 algorithm outperforms other classical DRL algorithms in UCAV trajectory planning, exhibiting superior performance and adaptability.
first_indexed	2024-03-09T01:44:28Z
format	Article
id	doaj.art-e017ec3d5b6f43a28ac74d76d3114399
institution	Directory Open Access Journal
issn	2072-4292
language	English
last_indexed	2024-03-09T01:44:28Z
publishDate	2023-11-01
publisher	MDPI AG
record_format	Article
series	Remote Sensing
spelling	doaj.art-e017ec3d5b6f43a28ac74d76d3114399 | 2023-12-08T15:24:48Z | eng | MDPI AG | Remote Sensing | 2072-4292 | 2023-11-01 | 15 | 23 | 5494 | 10.3390/rs15235494 | Memory-Enhanced Twin Delayed Deep Deterministic Policy Gradient (ME-TD3)-Based Unmanned Combat Aerial Vehicle Trajectory Planning for Avoiding Radar Detection Threats in Dynamic and Unknown Environments | Jiantao Li (0) | Tianxian Zhang (1) | Kai Liu (2) | School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China | School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China | School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China | Unmanned combat aerial vehicle (UCAV) trajectory planning to avoid radar detection threats is a complicated optimization problem that has been widely studied. The rapid changes in Radar Cross Sections (RCSs), the unknown cruise trajectory of airborne radar, and the uncertain distribution of radars exacerbate the complexity of this problem. In this paper, we propose a novel UCAV trajectory planning method based on deep reinforcement learning (DRL) technology to overcome the adverse impacts caused by the dynamics and randomness of environments. A predictive control model is constructed to describe the dynamic characteristics of the UCAV trajectory planning problem in detail. To improve the UCAV’s predictive ability, we propose a memory-enhanced twin delayed deep deterministic policy gradient (ME-TD3) algorithm that uses an attention mechanism to effectively extract environmental patterns from historical information. The simulation results show that the proposed method can successfully train UCAVs to carry out trajectory planning tasks in dynamic and unknown environments. Furthermore, the ME-TD3 algorithm outperforms other classical DRL algorithms in UCAV trajectory planning, exhibiting superior performance and adaptability. | https://www.mdpi.com/2072-4292/15/23/5494 | UCAV | trajectory planning | predictive control model | memory-enhanced twin delayed deep deterministic policy gradient
title	Memory-Enhanced Twin Delayed Deep Deterministic Policy Gradient (ME-TD3)-Based Unmanned Combat Aerial Vehicle Trajectory Planning for Avoiding Radar Detection Threats in Dynamic and Unknown Environments
topic	UCAV, trajectory planning, predictive control model, memory-enhanced twin delayed deep deterministic policy gradient
url	https://www.mdpi.com/2072-4292/15/23/5494
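
The description in this record names two ingredients: a twin delayed deep deterministic policy gradient (TD3) learner and an attention mechanism over historical information. The record gives no architectural details, so the following is only a minimal sketch of the generic building blocks, not the authors' ME-TD3 implementation. The function names (`attention_pool`, `td3_target`), the scalar action, and all default hyperparameters are assumptions chosen for illustration.

```python
import math
import random


def attention_pool(query, history):
    """Scaled dot-product attention (generic form, assumed here).

    Weights each past observation by its similarity to the current one
    and returns the weighted average as a context vector.
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * h for q, h in zip(query, obs)) / scale for obs in history]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(history[0])
    return [sum(w * obs[i] for w, obs in zip(weights, history)) for i in range(dim)]


def clip(x, low, high):
    """Clamp x to the closed interval [low, high]."""
    return max(low, min(high, x))


def td3_target(reward, done, next_state, target_actor, target_q1, target_q2,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, max_action=1.0,
               rng=random):
    """Standard TD3 critic target for a scalar action.

    Adds clipped Gaussian noise to the target action (target policy
    smoothing), then takes the smaller of two target-critic estimates
    (clipped double Q-learning).
    """
    noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
    next_action = clip(target_actor(next_state) + noise, -max_action, max_action)
    q_min = min(target_q1(next_state, next_action),
                target_q2(next_state, next_action))
    return reward + gamma * (1.0 - done) * q_min


# Hypothetical usage: append the attention context to the current observation
# before handing the state to placeholder actor and critic functions.
observation = [1.0, 0.0]
past_observations = [[1.0, 0.0], [0.0, 1.0]]
state = observation + attention_pool(observation, past_observations)
y = td3_target(1.0, 0.0, state,
               target_actor=lambda s: 0.0,
               target_q1=lambda s, a: 2.0,
               target_q2=lambda s, a: 3.0)
```

In a real agent the actor and critics would be neural networks, and the attention query, keys, and values would pass through learned projections. Plain functions and raw observations are used here only to keep the sketch self-contained.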


Similar Items