COLREGs-Based Path Planning for USVs Using the Deep Reinforcement Learning Strategy

This research introduces a two-stage deep reinforcement learning approach for the cooperative path planning of unmanned surface vehicles (USVs). The method is designed to address cooperative collision-avoidance path planning while adhering to the International Regulations for Preventing Collisions a...


Bibliographic Details
Main Authors: Naifeng Wen, Yundong Long, Rubo Zhang, Guanqun Liu, Wenjie Wan, Dian Jiao
Format: Article
Language: English
Published: MDPI AG 2023-12-01
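Series: Journal of Marine Science and Engineering
Subjects: COLREGs; USV cooperative path planning; multi-agent proximal policy optimization; deep learning; target detection
Online Access: https://www.mdpi.com/2077-1312/11/12/2334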
author Naifeng Wen
Yundong Long
Rubo Zhang
Guanqun Liu
Wenjie Wan
Dian Jiao
collection DOAJ
description This research introduces a two-stage deep reinforcement learning approach for the cooperative path planning of unmanned surface vehicles (USVs). The method is designed to address cooperative collision-avoidance path planning while adhering to the International Regulations for Preventing Collisions at Sea (COLREGs) and considering the collision-avoidance problem within the USV fleet and between USVs and target ships (TSs). To this end, the study presents a dual COLREGs-compliant action-selection strategy for the vessel-avoidance problem. First, a COLREGs-compliant action-evaluation network is constructed; it is a deep learning network trained on pre-recorded trajectories in which USVs avoid TSs in compliance with COLREGs. Second, a COLREGs-compliant action-selection network based on the reward function is proposed, covering various TS encounter scenarios. The outputs of the two networks are then fused to select actions during cooperative path planning. The path-planning model is built with the multi-agent proximal policy optimization (MAPPO) method, and the action space, observation space, and reward function are tailored to the policy network. Additionally, a TS detection method is introduced to detect the motion intentions of TSs. Monte Carlo simulations were conducted to demonstrate the performance of the planning method, and experiments focusing on COLREGs-based TS avoidance were carried out to validate the feasibility of the approach. The proposed TS detection model exhibited robust performance within the defined task.
first_indexed 2024-03-08T20:37:32Z
format Article
id doaj.art-1e20208442bc4e4782848c33c9ef1bba
institution Directory Open Access Journal
issn 2077-1312
language English
last_indexed 2024-03-08T20:37:32Z
publishDate 2023-12-01
publisher MDPI AG
record_format Article
series Journal of Marine Science and Engineering
spelling doaj.art-1e20208442bc4e4782848c33c9ef1bba
2023-12-22T14:18:58Z
eng
MDPI AG
Journal of Marine Science and Engineering
2077-1312
2023-12-01
Volume 11, Issue 12, Article 2334
DOI 10.3390/jmse11122334
COLREGs-Based Path Planning for USVs Using the Deep Reinforcement Learning Strategy
Naifeng Wen, Yundong Long, Rubo Zhang, Guanqun Liu, Wenjie Wan, Dian Jiao (all: College of Mechanical and Electronic Engineering, Dalian Minzu University, Dalian 116600, China)
https://www.mdpi.com/2077-1312/11/12/2334
COLREGs
USV cooperative path planning
multi-agent proximal policy optimization
deep learning
target detection
title COLREGs-Based Path Planning for USVs Using the Deep Reinforcement Learning Strategy
topic COLREGs
USV cooperative path planning
multi-agent proximal policy optimization
deep learning
target detection
url https://www.mdpi.com/2077-1312/11/12/2334