APER-DDQN: UAV Precise Airdrop Method Based on Deep Reinforcement Learning

Accuracy is the most critical factor determining the effectiveness of an unmanned aerial vehicle (UAV) airdrop. Traditional model-based approaches to improving airdrop accuracy suffer from limitations such as complex modeling, large numbers of model parameters, and difficulty in accounting comprehensively for all relevant factors in complex real-world environments. To address the UAV precision-airdrop problem more conveniently, this paper applies deep reinforcement learning and proposes an Adaptive Priority Experience Replay Deep Double Q-Network (APER-DDQN) algorithm built on the Deep Double Q-Network (DDQN). The method augments DDQN with a prioritized experience replay mechanism and adopts an adaptive discount rate and learning rate to improve the algorithm's decision-making performance and stability. The paper also designs and builds a simulation platform for training and testing the algorithm. Experimental results show that APER-DDQN performs well and solves the UAV precision-airdrop problem more effectively while avoiding the complex modeling process. First, in the training stage, APER-DDQN converges faster, earns higher reward, and behaves more stably than DDQN and the Deep Q-Network (DQN). Then, in the test phase, compared with decisions based on human experience, the method achieves a higher average reward (3.01 on average) and a higher success rate (41% on average), and it also outperforms DDQN and DQN. Finally, extended experiments verify APER-DDQN's ability to generalize to different environments.
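The abstract combines two standard components whose usual forms are worth recalling; the record itself does not spell them out, so the formulas below are the textbook versions, with the discount written as \gamma_t to reflect the paper's adaptive discount rate. The Double DQN target uses the online network (weights \theta) to select the next action and the target network (weights \theta^-) to evaluate it:

y_t = r_t + \gamma_t \, Q\big(s_{t+1}, \arg\max_a Q(s_{t+1}, a; \theta);\, \theta^-\big)

Proportional prioritized experience replay samples transition i with probability P(i) and corrects the resulting sampling bias with importance weights w_i:

P(i) = \frac{p_i^{\alpha}}{\sum_k p_k^{\alpha}}, \qquad w_i = \big(N \cdot P(i)\big)^{-\beta}

where p_i is the transition's priority (typically its absolute TD error), N is the buffer size, and \alpha, \beta are the prioritization and bias-correction exponents.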

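A minimal sketch of how these pieces fit together, assuming a PyTorch setting. The proportional replay buffer follows the standard PER recipe; the linear schedules stand in for the paper's adaptive discount rate and learning rate, whose exact forms are not given in this record; the class and function names here are illustrative, not the authors' implementation.

# Illustrative APER-DDQN-style update (hypothetical reconstruction, not the
# authors' code): Double DQN with proportional prioritized replay and
# scheduled ("adaptive") discount rate and learning rate.
import numpy as np
import torch

class PrioritizedReplay:
    """Proportional PER: P(i) = p_i^alpha / sum_k p_k^alpha, with
    importance-sampling weights w_i = (N * P(i))^(-beta)."""
    def __init__(self, capacity, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.prios, self.pos = [], [], 0

    def push(self, transition):                        # (s, a, r, s2, done)
        p = max(self.prios, default=1.0)               # new samples get max priority
        if len(self.data) < self.capacity:
            self.data.append(transition); self.prios.append(p)
        else:
            self.data[self.pos], self.prios[self.pos] = transition, p
            self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        probs = np.asarray(self.prios) ** self.alpha
        probs /= probs.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        w = (len(self.data) * probs[idx]) ** (-beta)
        w /= w.max()                                   # normalize weights for stability
        return [self.data[i] for i in idx], idx, torch.as_tensor(w, dtype=torch.float32)

    def update(self, idx, td_errors, eps=1e-5):
        for i, e in zip(idx, td_errors):
            self.prios[i] = abs(float(e)) + eps        # priority = |TD error|

def linear(start, end, frac):
    """Stand-in for the paper's (unspecified) adaptive gamma / lr schedules."""
    return start + (end - start) * min(max(frac, 0.0), 1.0)

def ddqn_step(online, target, opt, replay, batch_size, step, total_steps):
    gamma = linear(0.90, 0.99, step / total_steps)     # assumed adaptive discount rate
    for g in opt.param_groups:                         # assumed adaptive learning rate
        g["lr"] = linear(1e-3, 1e-4, step / total_steps)
    batch, idx, w = replay.sample(batch_size)
    s  = torch.stack([torch.as_tensor(t[0], dtype=torch.float32) for t in batch])
    a  = torch.tensor([t[1] for t in batch])
    r  = torch.tensor([t[2] for t in batch], dtype=torch.float32)
    s2 = torch.stack([torch.as_tensor(t[3], dtype=torch.float32) for t in batch])
    d  = torch.tensor([float(t[4]) for t in batch])
    q = online(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():                              # Double DQN: online net picks
        a_star = online(s2).argmax(1, keepdim=True)    # the action, target net scores it
        y = r + gamma * (1.0 - d) * target(s2).gather(1, a_star).squeeze(1)
    td = y - q
    loss = (w * td.pow(2)).mean()                      # importance-weighted MSE
    opt.zero_grad(); loss.backward(); opt.step()
    replay.update(idx, td.detach())                    # re-prioritize sampled transitions
    return loss.item()

With two small Q-networks (online and target) over the UAV's state features, ddqn_step would be called once per environment step, with the target network's weights periodically copied from the online network, as in standard DDQN training.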

Bibliographic Details
Main Authors: Yan Ouyang, Xinqing Wang, Ruizhe Hu, Honghui Xu
Format: Article
Language: English
Published: IEEE, 2022-01-01
Series: IEEE Access
ISSN: 2169-3536
Volume/Pages: Vol. 10, pp. 50878-50891
DOI: 10.1109/ACCESS.2022.3174105
Author Affiliation: Department of Mechanical Engineering, College of Field Engineering, Army Engineering University, Nanjing, China
Subjects: UAV airdrop; deep reinforcement learning; double deep Q-network; priority experience replay
Online Access: https://ieeexplore.ieee.org/document/9771405/