AUV Collision Avoidance Planning Method Based on Deep Deterministic Policy Gradient

Collision avoidance planning has always been a hot and important issue in the field of unmanned aircraft research. In this article, we describe an online collision avoidance planning algorithm for autonomous underwater vehicle (AUV) autonomous navigation, which relies on its own active sonar sensor...

Bibliographic Details
Main Authors: Jianya Yuan, Mengxue Han, Hongjian Wang, Bo Zhong, Wei Gao, Dan Yu
Format: Article
Language: English
Published: MDPI AG 2023-11-01
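Series: Journal of Marine Science and Engineering
Subjects: autonomous underwater vehicle (AUV); particle swarm optimization (PSO); collision avoidance planning; deep deterministic policy gradient (DDPG)
Online Access: https://www.mdpi.com/2077-1312/11/12/2258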
collection DOAJ
description Collision avoidance planning has always been a hot and important issue in the field of unmanned aircraft research. In this article, we describe an online collision avoidance planning algorithm for autonomous underwater vehicle (AUV) autonomous navigation, which relies on its own active sonar sensor to detect obstacles. The improved particle swarm optimization (I-PSO) algorithm is used to complete the path planning of the AUV under the known environment, and we use it as a benchmark to improve the fitness function and inertia weight of the algorithm. Traditional path-planning algorithms rely on accurate environment maps, where re-adapting the generated path can be highly demanding in terms of computational cost. We propose a deep reinforcement learning (DRL) algorithm based on collision avoidance tasks. The algorithm discussed in this paper takes into account the relative position of the target point and the rate of heading change from the previous timestep. Its reward function considers the target point, running time and turning angle at the same time. Compared with the LSTM structure, the Gated Recurrent Unit (GRU) network has fewer parameters, which helps to save training time. A series of simulation results show that the proposed deep deterministic policy gradient (DDPG) algorithm can obtain excellent results in simple and complex environments.
first_indexed 2024-03-08T20:37:35Z
id doaj.art-2998580de9aa47dfb460b08c8314ff99
institution Directory Open Access Journal
issn 2077-1312
last_indexed 2024-03-08T20:37:35Z
spelling doaj.art-2998580de9aa47dfb460b08c8314ff99; 2023-12-22T14:18:43Z; eng; MDPI AG; Journal of Marine Science and Engineering; ISSN 2077-1312; 2023-11-01; Vol. 11, No. 12, Art. 2258; DOI 10.3390/jmse11122258
Jianya Yuan: College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 045100, China
Mengxue Han: AVIC China Aero-Polytechnology Establishment, Beijing 100000, China
Hongjian Wang: College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 045100, China
Bo Zhong: College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 045100, China
Wei Gao: College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 045100, China
Dan Yu: College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 045100, China
topic autonomous underwater vehicle (AUV)
particle swarm optimization (PSO)
collision avoidance planning
deep deterministic policy gradient (DDPG)