APER-DDQN: UAV Precise Airdrop Method Based on Deep Reinforcement Learning
Accuracy is the most critical factor determining the effectiveness of an unmanned aerial vehicle (UAV) airdrop. Traditional model-based methods for improving airdrop accuracy have limitations such as complex modeling, many model parameters, and difficulty in comprehensively accounting for all relevant factors in complex, realistic environments. To solve the UAV precision-airdrop problem more conveniently, this paper applies deep reinforcement learning and proposes an Adaptive Priority Experience Replay Deep Double Q-Network (APER-DDQN) algorithm built on the Deep Double Q-Network (DDQN). The method adds a prioritized experience replay mechanism to DDQN and adopts an adaptive discount rate and learning rate to improve the algorithm's decision-making performance and stability. Furthermore, this paper designs and builds a simulation platform for training and testing the algorithm. Experimental results show that APER-DDQN performs well and solves the UAV precision-airdrop problem more effectively while avoiding the complex modeling process. First, in the training stage, APER-DDQN converges faster, earns higher reward, and is more stable than DDQN and the Deep Q-Network (DQN). Second, in the test phase, our method achieves a higher average reward (average 3.01) and success rate (average 41%) than decisions based on human experience, and it also outperforms DDQN and DQN. Finally, extended experiments verify that APER-DDQN generalizes to different environments.
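The mechanism the abstract describes, DDQN augmented with prioritized experience replay plus adaptive discount and learning rates, can be sketched compactly. The snippet below is a minimal illustration in PyTorch, not the authors' implementation: the replay scheme follows the standard proportional-priority formulation, and the linear annealing in `adaptive_schedules` is a placeholder assumption, since the paper's exact adaptation rules are not given in this record.

```python
import numpy as np
import torch

class PrioritizedReplay:
    """Proportional prioritized experience replay (the 'PER' in APER-DDQN)."""
    def __init__(self, capacity, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.prios, self.pos = [], [], 0

    def push(self, transition):
        # New transitions get the current max priority so they are replayed soon.
        p = max(self.prios, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.prios.append(p)
        else:
            self.data[self.pos] = transition
            self.prios[self.pos] = p
            self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        probs = np.asarray(self.prios) ** self.alpha
        probs /= probs.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        # Importance-sampling weights correct the bias of non-uniform sampling.
        w = (len(self.data) * probs[idx]) ** (-beta)
        w /= w.max()
        return [self.data[i] for i in idx], idx, torch.as_tensor(w, dtype=torch.float32)

    def update_priorities(self, idx, td_errors):
        for i, e in zip(idx, td_errors):
            self.prios[i] = float(abs(e)) + 1e-6  # keep priorities strictly positive

def adaptive_schedules(step, total_steps, gamma0=0.90, gamma1=0.99, lr0=1e-3, lr1=1e-4):
    # The abstract only says the discount rate and learning rate are adaptive;
    # a simple linear anneal stands in here for the paper's actual rule.
    t = min(step / total_steps, 1.0)
    return gamma0 + (gamma1 - gamma0) * t, lr0 + (lr1 - lr0) * t

def ddqn_step(online, target, buffer, optimizer, gamma, batch_size=32):
    batch, idx, w = buffer.sample(batch_size)
    s, a, r, s2, done = zip(*batch)
    s = torch.as_tensor(np.stack(s), dtype=torch.float32)
    s2 = torch.as_tensor(np.stack(s2), dtype=torch.float32)
    a = torch.as_tensor(a, dtype=torch.int64).unsqueeze(1)
    r = torch.as_tensor(r, dtype=torch.float32)
    done = torch.as_tensor(done, dtype=torch.float32)
    with torch.no_grad():
        # Double DQN: the online net picks the next action, the target net scores it.
        a2 = online(s2).argmax(dim=1, keepdim=True)
        y = r + gamma * (1.0 - done) * target(s2).gather(1, a2).squeeze(1)
    q = online(s).gather(1, a).squeeze(1)
    td = y - q
    loss = (w * td.pow(2)).mean()  # importance-weighted TD loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    buffer.update_priorities(idx, td.detach().numpy())
```

A full training loop would push simulator transitions into the buffer, query `adaptive_schedules` each step (updating the optimizer's learning rate via its `param_groups`), and periodically copy the online network's weights into the target network.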
Main Authors: | Yan Ouyang, Xinqing Wang, Ruizhe Hu, Honghui Xu
Format: | Article
Language: | English
Published: | IEEE, 2022-01-01
Series: | IEEE Access
Subjects: | UAV airdrop; deep reinforcement learning; double deep Q-network; priority experience replay
Online Access: | https://ieeexplore.ieee.org/document/9771405/
_version_ | 1818253376130908160 |
author | Yan Ouyang, Xinqing Wang, Ruizhe Hu, Honghui Xu
author_facet | Yan Ouyang, Xinqing Wang, Ruizhe Hu, Honghui Xu
author_sort | Yan Ouyang |
collection | DOAJ |
description | Accuracy is the most critical factor determining the effectiveness of an unmanned aerial vehicle (UAV) airdrop. Traditional model-based methods for improving airdrop accuracy have limitations such as complex modeling, many model parameters, and difficulty in comprehensively accounting for all relevant factors in complex, realistic environments. To solve the UAV precision-airdrop problem more conveniently, this paper applies deep reinforcement learning and proposes an Adaptive Priority Experience Replay Deep Double Q-Network (APER-DDQN) algorithm built on the Deep Double Q-Network (DDQN). The method adds a prioritized experience replay mechanism to DDQN and adopts an adaptive discount rate and learning rate to improve the algorithm's decision-making performance and stability. Furthermore, this paper designs and builds a simulation platform for training and testing the algorithm. Experimental results show that APER-DDQN performs well and solves the UAV precision-airdrop problem more effectively while avoiding the complex modeling process. First, in the training stage, APER-DDQN converges faster, earns higher reward, and is more stable than DDQN and the Deep Q-Network (DQN). Second, in the test phase, our method achieves a higher average reward (average 3.01) and success rate (average 41%) than decisions based on human experience, and it also outperforms DDQN and DQN. Finally, extended experiments verify that APER-DDQN generalizes to different environments. |
first_indexed | 2024-12-12T16:39:05Z |
format | Article |
id | doaj.art-6d37814f98a24dbe82e3d0224e571367 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-12-12T16:39:05Z |
publishDate | 2022-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | Record ID: doaj.art-6d37814f98a24dbe82e3d0224e571367
Indexed: 2022-12-22T00:18:36Z
Language: eng
Publisher: IEEE
Series: IEEE Access (ISSN 2169-3536)
Published: 2022-01-01, Vol. 10, pp. 50878-50891
DOI: 10.1109/ACCESS.2022.3174105
Article number: 9771405
Title: APER-DDQN: UAV Precise Airdrop Method Based on Deep Reinforcement Learning
Authors: Yan Ouyang (https://orcid.org/0000-0002-4481-9662), Xinqing Wang, Ruizhe Hu (https://orcid.org/0000-0002-4891-9820), Honghui Xu (https://orcid.org/0000-0002-0053-5676), all with the Department of Mechanical Engineering, College of Field Engineering, Army Engineering University, Nanjing, China
Online access: https://ieeexplore.ieee.org/document/9771405/
Keywords: UAV airdrop; deep reinforcement learning; double deep Q-network; priority experience replay |
spellingShingle | Yan Ouyang; Xinqing Wang; Ruizhe Hu; Honghui Xu; APER-DDQN: UAV Precise Airdrop Method Based on Deep Reinforcement Learning; IEEE Access; UAV airdrop; deep reinforcement learning; double deep Q-network; priority experience replay
title | APER-DDQN: UAV Precise Airdrop Method Based on Deep Reinforcement Learning |
title_full | APER-DDQN: UAV Precise Airdrop Method Based on Deep Reinforcement Learning |
title_fullStr | APER-DDQN: UAV Precise Airdrop Method Based on Deep Reinforcement Learning |
title_full_unstemmed | APER-DDQN: UAV Precise Airdrop Method Based on Deep Reinforcement Learning |
title_short | APER-DDQN: UAV Precise Airdrop Method Based on Deep Reinforcement Learning |
title_sort | aper ddqn uav precise airdrop method based on deep reinforcement learning |
topic | UAV airdrop; deep reinforcement learning; double deep Q-network; priority experience replay
url | https://ieeexplore.ieee.org/document/9771405/ |
work_keys_str_mv | AT yanouyang aperddqnuavpreciseairdropmethodbasedondeepreinforcementlearning AT xinqingwang aperddqnuavpreciseairdropmethodbasedondeepreinforcementlearning AT ruizhehu aperddqnuavpreciseairdropmethodbasedondeepreinforcementlearning AT honghuixu aperddqnuavpreciseairdropmethodbasedondeepreinforcementlearning |