COLREGs-Based Path Planning for USVs Using the Deep Reinforcement Learning Strategy
This research introduces a two-stage deep reinforcement learning approach for the cooperative path planning of unmanned surface vehicles (USVs). The method is designed to address cooperative collision-avoidance path planning while adhering to the International Regulations for Preventing Collisions a...
Main Authors: | Naifeng Wen, Yundong Long, Rubo Zhang, Guanqun Liu, Wenjie Wan, Dian Jiao |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-12-01 |
Series: | Journal of Marine Science and Engineering |
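Subjects: | COLREGs; USV cooperative path planning; multi-agent proximal policy optimization; deep learning; target detection |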
Online Access: | https://www.mdpi.com/2077-1312/11/12/2334 |
_version_ | 1797380390879494144 |
---|---|
author | Naifeng Wen; Yundong Long; Rubo Zhang; Guanqun Liu; Wenjie Wan; Dian Jiao |
author_facet | Naifeng Wen; Yundong Long; Rubo Zhang; Guanqun Liu; Wenjie Wan; Dian Jiao |
author_sort | Naifeng Wen |
collection | DOAJ |
description | This research introduces a two-stage deep reinforcement learning approach for the cooperative path planning of unmanned surface vehicles (USVs). The method is designed to address cooperative collision-avoidance path planning while adhering to the International Regulations for Preventing Collisions at Sea (COLREGs) and considering the collision-avoidance problem within the USV fleet and between USVs and target ships (TSs). To achieve this, the study presents a dual COLREGs-compliant action-selection strategy to effectively manage the vessel-avoidance problem. Firstly, we construct a COLREGs-compliant action-evaluation network that utilizes a deep learning network trained on pre-recorded TS avoidance trajectories by USVs in compliance with COLREGs. Then, the COLREGs-compliant reward-function-based action-selection network is proposed by considering various TS encountering scenarios. Consequently, the results of the two networks are fused to select actions for cooperative path-planning processes. The path-planning model is established using the multi-agent proximal policy optimization (MAPPO) method. The action space, observation space, and reward function are tailored for the policy network. Additionally, a TS detection method is introduced to detect the motion intentions of TSs. The study conducted Monte Carlo simulations to demonstrate the strong performance of the planning method. Furthermore, experiments focusing on COLREGs-based TS avoidance were carried out to validate the feasibility of the approach. The proposed TS detection model exhibited robust performance within the defined task. |
first_indexed | 2024-03-08T20:37:32Z |
format | Article |
id | doaj.art-1e20208442bc4e4782848c33c9ef1bba |
institution | Directory of Open Access Journals |
issn | 2077-1312 |
language | English |
last_indexed | 2024-03-08T20:37:32Z |
publishDate | 2023-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Journal of Marine Science and Engineering |
spelling | doaj.art-1e20208442bc4e4782848c33c9ef1bba; 2023-12-22T14:18:58Z; eng; MDPI AG; Journal of Marine Science and Engineering; 2077-1312; 2023-12-01; 11(12), 2334; 10.3390/jmse11122334; COLREGs-Based Path Planning for USVs Using the Deep Reinforcement Learning Strategy; Naifeng Wen, Yundong Long, Rubo Zhang, Guanqun Liu, Wenjie Wan, Dian Jiao (all: College of Mechanical and Electronic Engineering, Dalian Minzu University, Dalian 116600, China); https://www.mdpi.com/2077-1312/11/12/2334; COLREGs; USV cooperative path planning; multi-agent proximal policy optimization; deep learning; target detection |
title | COLREGs-Based Path Planning for USVs Using the Deep Reinforcement Learning Strategy |
title_sort | colregs based path planning for usvs using the deep reinforcement learning strategy |
topic | COLREGs; USV cooperative path planning; multi-agent proximal policy optimization; deep learning; target detection |
url | https://www.mdpi.com/2077-1312/11/12/2334 |