Memory-Enhanced Twin Delayed Deep Deterministic Policy Gradient (ME-TD3)-Based Unmanned Combat Aerial Vehicle Trajectory Planning for Avoiding Radar Detection Threats in Dynamic and Unknown Environments

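Unmanned combat aerial vehicle (UCAV) trajectory planning to avoid radar detection threats is a complicated optimization problem that has been widely studied. The rapid changes in Radar Cross Sections (RCSs), the unknown cruise trajectory of airborne radar, and the uncertain distribution of radars exacerbate the complexity of this problem. In this paper, we propose a novel UCAV trajectory planning method based on deep reinforcement learning (DRL) technology to overcome the adverse impacts caused by the dynamics and randomness of environments. A predictive control model is constructed to describe the dynamic characteristics of the UCAV trajectory planning problem in detail. To improve the UCAV’s predictive ability, we propose a memory-enhanced twin delayed deep deterministic policy gradient (ME-TD3) algorithm that uses an attention mechanism to effectively extract environmental patterns from historical information. The simulation results show that the proposed method can successfully train UCAVs to carry out trajectory planning tasks in dynamic and unknown environments. Furthermore, the ME-TD3 algorithm outperforms other classical DRL algorithms in UCAV trajectory planning, exhibiting superior performance and adaptability.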

Bibliographic Details
Main Authors: Jiantao Li, Tianxian Zhang, Kai Liu
Format: Article
Language: English
Published: MDPI AG 2023-11-01
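Series: Remote Sensing
Subjects: UCAV; trajectory planning; predictive control model; memory-enhanced twin delayed deep deterministic policy gradient
Online Access: https://www.mdpi.com/2072-4292/15/23/5494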
author Jiantao Li
Tianxian Zhang
Kai Liu
collection DOAJ
description Unmanned combat aerial vehicle (UCAV) trajectory planning to avoid radar detection threats is a complicated optimization problem that has been widely studied. The rapid changes in Radar Cross Sections (RCSs), the unknown cruise trajectory of airborne radar, and the uncertain distribution of radars exacerbate the complexity of this problem. In this paper, we propose a novel UCAV trajectory planning method based on deep reinforcement learning (DRL) technology to overcome the adverse impacts caused by the dynamics and randomness of environments. A predictive control model is constructed to describe the dynamic characteristics of the UCAV trajectory planning problem in detail. To improve the UCAV’s predictive ability, we propose a memory-enhanced twin delayed deep deterministic policy gradient (ME-TD3) algorithm that uses an attention mechanism to effectively extract environmental patterns from historical information. The simulation results show that the proposed method can successfully train UCAVs to carry out trajectory planning tasks in dynamic and unknown environments. Furthermore, the ME-TD3 algorithm outperforms other classical DRL algorithms in UCAV trajectory planning, exhibiting superior performance and adaptability.
first_indexed 2024-03-09T01:44:28Z
format Article
id doaj.art-e017ec3d5b6f43a28ac74d76d3114399
institution Directory Open Access Journal
issn 2072-4292
language English
last_indexed 2024-03-09T01:44:28Z
publishDate 2023-11-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj.art-e017ec3d5b6f43a28ac74d76d3114399; 2023-12-08T15:24:48Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2023-11-01; vol. 15, no. 23, art. 5494; DOI 10.3390/rs15235494; Jiantao Li, Tianxian Zhang, Kai Liu (all: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China); https://www.mdpi.com/2072-4292/15/23/5494; keywords: UCAV, trajectory planning, predictive control model, memory-enhanced twin delayed deep deterministic policy gradient
title Memory-Enhanced Twin Delayed Deep Deterministic Policy Gradient (ME-TD3)-Based Unmanned Combat Aerial Vehicle Trajectory Planning for Avoiding Radar Detection Threats in Dynamic and Unknown Environments
topic UCAV
trajectory planning
predictive control model
memory-enhanced twin delayed deep deterministic policy gradient
url https://www.mdpi.com/2072-4292/15/23/5494