AUV Collision Avoidance Planning Method Based on Deep Deterministic Policy Gradient
Collision avoidance planning has long been a prominent and important topic in unmanned vehicle research. In this article, we describe an online collision avoidance planning algorithm for the autonomous navigation of an autonomous underwater vehicle (AUV), which relies on the vehicle's own active sonar sensor to detect obstacles...
Main Authors: | Jianya Yuan, Mengxue Han, Hongjian Wang, Bo Zhong, Wei Gao, Dan Yu |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-11-01 |
Series: | Journal of Marine Science and Engineering |
Subjects: | autonomous underwater vehicle (AUV); particle swarm optimization (PSO); collision avoidance planning; deep deterministic policy gradient (DDPG) |
Online Access: | https://www.mdpi.com/2077-1312/11/12/2258 |
author | Jianya Yuan, Mengxue Han, Hongjian Wang, Bo Zhong, Wei Gao, Dan Yu |
author_facet | Jianya Yuan, Mengxue Han, Hongjian Wang, Bo Zhong, Wei Gao, Dan Yu |
author_sort | Jianya Yuan |
collection | DOAJ |
description | Collision avoidance planning has long been a prominent and important topic in unmanned vehicle research. In this article, we describe an online collision avoidance planning algorithm for the autonomous navigation of an autonomous underwater vehicle (AUV), which relies on the vehicle's own active sonar sensor to detect obstacles. An improved particle swarm optimization (I-PSO) algorithm, with a refined fitness function and inertia weight, is used to plan the AUV's path in a known environment and serves as a benchmark. Traditional path-planning algorithms rely on accurate environment maps, and re-planning the generated path can be computationally expensive. We therefore propose a deep reinforcement learning (DRL) algorithm for the collision avoidance task. The algorithm takes into account the relative position of the target point and the heading-change rate from the previous timestep, and its reward function jointly considers the target point, the running time, and the turning angle. Compared with an LSTM structure, the Gated Recurrent Unit (GRU) network has fewer parameters, which helps to reduce training time. A series of simulation results shows that the proposed deep deterministic policy gradient (DDPG) algorithm obtains excellent results in both simple and complex environments. |
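The abstract highlights three design choices: an observation that includes the target's relative position and the previous timestep's heading-change rate, a reward that weighs progress toward the target against running time and turning angle, and a GRU recurrent layer (rather than an LSTM) inside the DDPG networks. The record does not include the paper's equations or code, so the sketch below is only an illustration of how such an actor and reward could be wired up in PyTorch; the state layout, layer sizes, reward weights, and terminal bonuses are assumptions, not the authors' implementation.

```python
# Hedged sketch: a GRU-based DDPG actor and a composite reward, following the
# design described in the abstract. All dimensions and weights are assumptions.
import math
import torch
import torch.nn as nn

STATE_DIM = 12   # assumed: sonar ranges + target relative position + previous heading-change rate
ACTION_DIM = 1   # assumed: commanded heading-change rate

class GRUActor(nn.Module):
    """Deterministic policy mu(s): a GRU over recent observations, then an MLP head."""
    def __init__(self, state_dim=STATE_DIM, action_dim=ACTION_DIM, hidden=128):
        super().__init__()
        # A GRU of the same width has fewer parameters than an LSTM, as the abstract notes.
        self.gru = nn.GRU(state_dim, hidden, batch_first=True)
        self.head = nn.Sequential(
            nn.Linear(hidden, 64), nn.ReLU(),
            nn.Linear(64, action_dim), nn.Tanh(),  # action scaled to [-1, 1]
        )

    def forward(self, obs_seq, h0=None):
        # obs_seq: (batch, seq_len, state_dim)
        out, hn = self.gru(obs_seq, h0)
        return self.head(out[:, -1]), hn

def reward(dist_to_goal, prev_dist_to_goal, turn_angle, collided, reached,
           w_goal=1.0, w_time=0.05, w_turn=0.1):
    """Composite reward combining target progress, running time, and turning angle.
    Weights and terminal bonuses are illustrative, not taken from the paper."""
    if collided:
        return -100.0
    if reached:
        return 100.0
    progress = prev_dist_to_goal - dist_to_goal   # positive when moving toward the goal
    return w_goal * progress - w_time * 1.0 - w_turn * abs(turn_angle)

if __name__ == "__main__":
    actor = GRUActor()
    obs = torch.randn(1, 8, STATE_DIM)            # one 8-step observation window
    action, _ = actor(obs)
    r = reward(dist_to_goal=9.5, prev_dist_to_goal=10.0,
               turn_angle=math.radians(5), collided=False, reached=False)
    print(action.shape, round(r, 3))
```

The parameter saving mentioned in the abstract comes from the recurrent cell itself: a GRU carries three gate/candidate weight sets per layer where an LSTM of the same hidden width carries four, so the recurrent part of the networks trains somewhat faster.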
first_indexed | 2024-03-08T20:37:35Z |
format | Article |
id | doaj.art-2998580de9aa47dfb460b08c8314ff99 |
institution | Directory Open Access Journal |
issn | 2077-1312 |
language | English |
last_indexed | 2024-03-08T20:37:35Z |
publishDate | 2023-11-01 |
publisher | MDPI AG |
record_format | Article |
series | Journal of Marine Science and Engineering |
spelling | doaj.art-2998580de9aa47dfb460b08c8314ff99 (indexed 2023-12-22T14:18:43Z); eng; MDPI AG; Journal of Marine Science and Engineering; ISSN 2077-1312; 2023-11-01; vol. 11, no. 12, article 2258; DOI 10.3390/jmse11122258. Author affiliations: Jianya Yuan, Hongjian Wang, Bo Zhong, Wei Gao and Dan Yu: College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 045100, China; Mengxue Han: AVIC China Aero-Polytechnology Establishment, Beijing 100000, China. Title, abstract, access URL and keywords duplicate the fields above. |
title | AUV Collision Avoidance Planning Method Based on Deep Deterministic Policy Gradient |
topic | autonomous underwater vehicle (AUV); particle swarm optimization (PSO); collision avoidance planning; deep deterministic policy gradient (DDPG) |
url | https://www.mdpi.com/2077-1312/11/12/2258 |