Memory-Enhanced Twin Delayed Deep Deterministic Policy Gradient (ME-TD3)-Based Unmanned Combat Aerial Vehicle Trajectory Planning for Avoiding Radar Detection Threats in Dynamic and Unknown Environments
Unmanned combat aerial vehicle (UCAV) trajectory planning to avoid radar detection threats is a complicated optimization problem that has been widely studied. The rapid changes in Radar Cross Sections (RCSs), the unknown cruise trajectory of airborne radar, and the uncertain distribution of radars e...
Main Authors: | Jiantao Li, Tianxian Zhang, Kai Liu |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-11-01 |
Series: | Remote Sensing |
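Subjects: | UCAV; trajectory planning; predictive control model; memory-enhanced twin delayed deep deterministic policy gradient |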
Online Access: | https://www.mdpi.com/2072-4292/15/23/5494 |
_version_ | 1797399671854858240 |
---|---|
author | Jiantao Li; Tianxian Zhang; Kai Liu |
author_facet | Jiantao Li; Tianxian Zhang; Kai Liu |
author_sort | Jiantao Li |
collection | DOAJ |
description | Unmanned combat aerial vehicle (UCAV) trajectory planning to avoid radar detection threats is a complicated optimization problem that has been widely studied. The rapid changes in Radar Cross Sections (RCSs), the unknown cruise trajectory of airborne radar, and the uncertain distribution of radars exacerbate the complexity of this problem. In this paper, we propose a novel UCAV trajectory planning method based on deep reinforcement learning (DRL) technology to overcome the adverse impacts caused by the dynamics and randomness of environments. A predictive control model is constructed to describe the dynamic characteristics of the UCAV trajectory planning problem in detail. To improve the UCAV’s predictive ability, we propose a memory-enhanced twin delayed deep deterministic policy gradient (ME-TD3) algorithm that uses an attention mechanism to effectively extract environmental patterns from historical information. The simulation results show that the proposed method can successfully train UCAVs to carry out trajectory planning tasks in dynamic and unknown environments. Furthermore, the ME-TD3 algorithm outperforms other classical DRL algorithms in UCAV trajectory planning, exhibiting superior performance and adaptability. |
first_indexed | 2024-03-09T01:44:28Z |
format | Article |
id | doaj.art-e017ec3d5b6f43a28ac74d76d3114399 |
institution | Directory Open Access Journal |
issn | 2072-4292 |
language | English |
last_indexed | 2024-03-09T01:44:28Z |
publishDate | 2023-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Remote Sensing |
spelling | doaj.art-e017ec3d5b6f43a28ac74d76d3114399; 2023-12-08T15:24:48Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2023-11-01; volume 15, issue 23, article 5494; 10.3390/rs15235494; Memory-Enhanced Twin Delayed Deep Deterministic Policy Gradient (ME-TD3)-Based Unmanned Combat Aerial Vehicle Trajectory Planning for Avoiding Radar Detection Threats in Dynamic and Unknown Environments; Jiantao Li, Tianxian Zhang, Kai Liu (all: School of Information and Communication Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China); Unmanned combat aerial vehicle (UCAV) trajectory planning to avoid radar detection threats is a complicated optimization problem that has been widely studied. The rapid changes in Radar Cross Sections (RCSs), the unknown cruise trajectory of airborne radar, and the uncertain distribution of radars exacerbate the complexity of this problem. In this paper, we propose a novel UCAV trajectory planning method based on deep reinforcement learning (DRL) technology to overcome the adverse impacts caused by the dynamics and randomness of environments. A predictive control model is constructed to describe the dynamic characteristics of the UCAV trajectory planning problem in detail. To improve the UCAV’s predictive ability, we propose a memory-enhanced twin delayed deep deterministic policy gradient (ME-TD3) algorithm that uses an attention mechanism to effectively extract environmental patterns from historical information. The simulation results show that the proposed method can successfully train UCAVs to carry out trajectory planning tasks in dynamic and unknown environments. Furthermore, the ME-TD3 algorithm outperforms other classical DRL algorithms in UCAV trajectory planning, exhibiting superior performance and adaptability.; https://www.mdpi.com/2072-4292/15/23/5494; UCAV; trajectory planning; predictive control model; memory-enhanced twin delayed deep deterministic policy gradient |
title | Memory-Enhanced Twin Delayed Deep Deterministic Policy Gradient (ME-TD3)-Based Unmanned Combat Aerial Vehicle Trajectory Planning for Avoiding Radar Detection Threats in Dynamic and Unknown Environments |
topic | UCAV; trajectory planning; predictive control model; memory-enhanced twin delayed deep deterministic policy gradient |
url | https://www.mdpi.com/2072-4292/15/23/5494 |
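
The description above names two ingredients: a twin delayed deep deterministic policy gradient (TD3) learner and an attention mechanism over historical information. The following is a minimal sketch of both ideas, not the authors' ME-TD3 implementation, whose architecture this record does not give. It assumes plain scaled dot-product attention with the current observation as the query, and the standard TD3 target (target policy smoothing plus the minimum of two critics). The function names, array shapes, default hyperparameters, and the critic callables `q1_next` and `q2_next` are hypothetical, chosen for illustration only.

```python
import numpy as np


def attention_pool(history, query):
    """Scaled dot-product attention over past observations (assumed form).

    history: (T, d) array of past observation features (hypothetical layout).
    query:   (d,) feature vector of the current observation.
    Returns the (d,) weighted summary and the (T,) attention weights.
    """
    d = history.shape[1]
    scores = history @ query / np.sqrt(d)   # similarity of each past step to the query
    scores = scores - scores.max()          # shift for a numerically stable softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    return weights @ history, weights


def td3_target(reward, done, next_action, q1_next, q2_next, rng,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, act_limit=1.0):
    """Standard TD3 target value for one transition.

    next_action: target-policy action for the next state.
    q1_next, q2_next: hypothetical callables mapping an action to the two
    target critics' values at the next state (state folded in for brevity).
    """
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = np.clip(rng.normal(0.0, noise_std, size=next_action.shape),
                    -noise_clip, noise_clip)
    smoothed = np.clip(next_action + noise, -act_limit, act_limit)
    # Clipped double-Q: take the smaller of the two critic estimates.
    q_min = np.minimum(q1_next(smoothed), q2_next(smoothed))
    return reward + gamma * (1.0 - done) * q_min, smoothed
```

In a memory-enhanced variant along these lines, the summary returned by `attention_pool` would be concatenated with the current observation before it is fed to the actor and critics; that wiring is an assumption here and may differ from the published method.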