COLREGs-Based Path Planning for USVs Using the Deep Reinforcement Learning Strategy

This research introduces a two-stage deep reinforcement learning approach for the cooperative path planning of unmanned surface vehicles (USVs). The method is designed to address cooperative collision-avoidance path planning while adhering to the International Regulations for Preventing Collisions a...

Bibliographic Details
Main Authors:	Naifeng Wen, Yundong Long, Rubo Zhang, Guanqun Liu, Wenjie Wan, Dian Jiao
Format:	Article
Language:	English
Published:	MDPI AG 2023-12-01
Series:	Journal of Marine Science and Engineering
Subjects:	COLREGs; USV cooperative path planning; multi-agent proximal policy optimization; deep learning; target detection
Online Access:	https://www.mdpi.com/2077-1312/11/12/2334

_version_	1797380390879494144
author	Naifeng Wen, Yundong Long, Rubo Zhang, Guanqun Liu, Wenjie Wan, Dian Jiao
collection	DOAJ
description	This research introduces a two-stage deep reinforcement learning approach for the cooperative path planning of unmanned surface vehicles (USVs). The method is designed to address cooperative collision-avoidance path planning while adhering to the International Regulations for Preventing Collisions at Sea (COLREGs) and considering the collision-avoidance problem within the USV fleet and between USVs and target ships (TSs). To achieve this, the study presents a dual COLREGs-compliant action-selection strategy to effectively manage the vessel-avoidance problem. Firstly, we construct a COLREGs-compliant action-evaluation network that utilizes a deep learning network trained on pre-recorded TS avoidance trajectories by USVs in compliance with COLREGs. Then, the COLREGs-compliant reward-function-based action-selection network is proposed by considering various TS encountering scenarios. Consequently, the results of the two networks are fused to select actions for cooperative path-planning processes. The path-planning model is established using the multi-agent proximal policy optimization (MAPPO) method. The action space, observation space, and reward function are tailored for the policy network. Additionally, a TS detection method is introduced to detect the motion intentions of TSs. The study conducted Monte Carlo simulations to demonstrate the strong performance of the planning method. Furthermore, experiments focusing on COLREGs-based TS avoidance were carried out to validate the feasibility of the approach. The proposed TS detection model exhibited robust performance within the defined task.
first_indexed	2024-03-08T20:37:32Z
format	Article
id	doaj.art-1e20208442bc4e4782848c33c9ef1bba
institution	Directory Open Access Journal
issn	2077-1312
language	English
last_indexed	2024-03-08T20:37:32Z
publishDate	2023-12-01
publisher	MDPI AG
record_format	Article
series	Journal of Marine Science and Engineering
spelling	doaj.art-1e20208442bc4e4782848c33c9ef1bba; 2023-12-22T14:18:58Z; eng; MDPI AG; Journal of Marine Science and Engineering; ISSN 2077-1312; 2023-12-01; vol. 11, no. 12, art. 2334; DOI 10.3390/jmse11122334; COLREGs-Based Path Planning for USVs Using the Deep Reinforcement Learning Strategy; Naifeng Wen, Yundong Long, Rubo Zhang, Guanqun Liu, Wenjie Wan, Dian Jiao (all: College of Mechanical and Electronic Engineering, Dalian Minzu University, Dalian 116600, China); https://www.mdpi.com/2077-1312/11/12/2334; COLREGs; USV cooperative path planning; multi-agent proximal policy optimization; deep learning; target detection
title	COLREGs-Based Path Planning for USVs Using the Deep Reinforcement Learning Strategy
topic	COLREGs; USV cooperative path planning; multi-agent proximal policy optimization; deep learning; target detection
url	https://www.mdpi.com/2077-1312/11/12/2334
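
The description in this record says that the outputs of a COLREGs-compliant action-evaluation network and a reward-function-based action-selection network are fused to select actions, but it does not say how. The Python sketch below is one possible reading, not the authors' method: it assumes a discrete action set, assumes both networks emit one non-negative score per action, and assumes a simple weighted blend. The function names, the `weight` parameter, and the example numbers are all hypothetical.

```python
from typing import List, Sequence


def fuse_action_scores(
    policy_probs: Sequence[float],
    compliance_scores: Sequence[float],
    weight: float = 0.5,
) -> List[float]:
    """Blend two per-action score vectors and renormalize to sum to 1.

    ASSUMPTION: a convex combination is used for fusion; the record
    does not specify the actual fusion rule.
    """
    if len(policy_probs) != len(compliance_scores):
        raise ValueError("score vectors must have the same length")
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    blended = [
        (1.0 - weight) * p + weight * c
        for p, c in zip(policy_probs, compliance_scores)
    ]
    total = sum(blended)
    if total <= 0.0:
        raise ValueError("fused scores must have a positive sum")
    return [b / total for b in blended]


def select_action(fused: Sequence[float]) -> int:
    """Return the index of the highest fused score (greedy selection)."""
    return max(range(len(fused)), key=lambda i: fused[i])


# Hypothetical head-on encounter with three discrete actions.
ACTIONS = ["turn_port", "keep_course", "turn_starboard"]
policy_probs = [0.5, 0.3, 0.2]       # made-up policy output
compliance_scores = [0.0, 0.2, 0.8]  # made-up compliance scores
fused = fuse_action_scores(policy_probs, compliance_scores, weight=0.5)
chosen = ACTIONS[select_action(fused)]
```

With these made-up numbers, the policy alone would prefer a turn to port, while the blended scores favor a turn to starboard, which is the manoeuvre COLREGs Rule 14 prescribes for head-on situations. Setting `weight=0.0` recovers the unmodified policy choice.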

Similar Items