Dynamic Obstacle Avoidance for USVs Using Cross-Domain Deep Reinforcement Learning and Neural Network Model Predictive Controller
This work presents a framework that allows Unmanned Surface Vehicles (USVs) to avoid dynamic obstacles through initial training on an Unmanned Ground Vehicle (UGV) and cross-domain retraining on a USV. This is achieved by integrating a Deep Reinforcement Learning (DRL) agent that generates high-level control commands...
Main Authors: | Jianwen Li, Jalil Chavez-Galaviz, Kamyar Azizzadenesheli, Nina Mahmoudian |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-03-01 |
Series: | Sensors |
Subjects: | unmanned surface vehicle; deep reinforcement learning; collision avoidance; model predictive control |
Online Access: | https://www.mdpi.com/1424-8220/23/7/3572 |
_version_ | 1797607021319553024 |
---|---|
author | Jianwen Li Jalil Chavez-Galaviz Kamyar Azizzadenesheli Nina Mahmoudian |
author_facet | Jianwen Li Jalil Chavez-Galaviz Kamyar Azizzadenesheli Nina Mahmoudian |
author_sort | Jianwen Li |
collection | DOAJ |
description | This work presents a framework that allows Unmanned Surface Vehicles (USVs) to avoid dynamic obstacles through initial training on an Unmanned Ground Vehicle (UGV) and cross-domain retraining on a USV. This is achieved by integrating a Deep Reinforcement Learning (DRL) agent that generates high-level control commands with a neural-network-based model predictive controller (NN-MPC) that reaches target waypoints and rejects disturbances. The Deep Q Network (DQN) used in this framework is trained in a ground environment with a Turtlebot robot and retrained in a water environment with the BREAM USV in the Gazebo simulator to avoid dynamic obstacles. The network is then validated in both simulation and real-world tests. Compared to training purely in the water domain, cross-domain learning reduces training time by 28% and improves obstacle avoidance performance (70 more reward points). This methodology shows that it is possible to leverage data-rich, accessible ground environments to train DRL agents for data-poor, difficult-to-access marine environments, allowing rapid and iterative agent development without retraining from scratch when the environment or vehicle dynamics change. |
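The core transfer-learning idea in the abstract — pre-train a value-based agent in a cheap source domain, then reuse its learned values as the starting point for brief fine-tuning in the target domain — can be sketched in miniature. This is a hypothetical illustration, not the paper's implementation: a tabular Q-learner on a toy 1-D task stands in for the DQN, and a changed step cost stands in for the ground-to-water dynamics shift; the `Chain` environment and all parameter values are invented for the sketch.

```python
import random

class Chain:
    """Toy 1-D navigation task (invented stand-in for the Gazebo environments):
    the agent moves left/right along n cells toward a goal at the far end.
    `step_cost` crudely models domain-dependent dynamics (ground vs. water)."""
    def __init__(self, n=6, step_cost=-0.1):
        self.n, self.step_cost = n, step_cost
    def reset(self):
        self.s = 0
        return self.s
    def step(self, a):  # a: 0 = left, 1 = right
        self.s = max(0, min(self.n - 1, self.s + (1 if a == 1 else -1)))
        done = self.s == self.n - 1
        return self.s, (1.0 if done else self.step_cost), done

def q_learn(env, q=None, episodes=200, alpha=0.5, gamma=0.9, eps=0.2):
    """Tabular Q-learning; passing a pre-trained table `q` is the transfer step
    (analogous to initializing the target-domain DQN with source-domain weights)."""
    if q is None:
        q = [[0.0, 0.0] for _ in range(env.n)]
    for _ in range(episodes):
        s, done, steps = env.reset(), False, 0
        while not done and steps < 50:
            # epsilon-greedy action selection
            a = random.randrange(2) if random.random() < eps else max((0, 1), key=lambda a: q[s][a])
            s2, r, done = env.step(a)
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s, steps = s2, steps + 1
    return q

random.seed(0)
# 1. "Ground" pre-training in the data-rich source domain.
q_ground = q_learn(Chain(step_cost=-0.1), episodes=200)
# 2. Cross-domain transfer: reuse the learned values, fine-tune briefly in "water".
q_water = q_learn(Chain(step_cost=-0.3), q=[row[:] for row in q_ground], episodes=20)

# Greedy rollout in the target domain with the transferred policy.
env = Chain(step_cost=-0.3)
s, done, steps = env.reset(), False, 0
while not done and steps < 20:
    s, _, done = env.step(max((0, 1), key=lambda a: q_water[s][a]))
    steps += 1
print("reached goal:", done)
```

The point of the sketch is the second `q_learn` call: fine-tuning starts from the transferred table rather than zeros, which is the mechanism behind the reported 28% reduction in training time relative to training from scratch in the water domain.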
first_indexed | 2024-03-11T05:24:27Z |
format | Article |
id | doaj.art-0ac5030034084f5c98f4c055685600a0 |
institution | Directory Open Access Journal |
issn | 1424-8220 |
language | English |
last_indexed | 2024-03-11T05:24:27Z |
publishDate | 2023-03-01 |
publisher | MDPI AG |
record_format | Article |
series | Sensors |
spelling | doaj.art-0ac5030034084f5c98f4c055685600a0 2023-11-17T17:34:33Z eng MDPI AG Sensors 1424-8220 2023-03-01 Vol. 23, No. 7, Art. 3572 doi:10.3390/s23073572 Dynamic Obstacle Avoidance for USVs Using Cross-Domain Deep Reinforcement Learning and Neural Network Model Predictive Controller. Jianwen Li (School of Mechanical Engineering, Purdue University, West Lafayette, IN 47907, USA); Jalil Chavez-Galaviz (School of Mechanical Engineering, Purdue University, West Lafayette, IN 47907, USA); Kamyar Azizzadenesheli (Nvidia Corporation, Santa Clara, CA 95051, USA); Nina Mahmoudian (School of Mechanical Engineering, Purdue University, West Lafayette, IN 47907, USA). https://www.mdpi.com/1424-8220/23/7/3572 Keywords: unmanned surface vehicle; deep reinforcement learning; collision avoidance; model predictive control |
spellingShingle | Jianwen Li Jalil Chavez-Galaviz Kamyar Azizzadenesheli Nina Mahmoudian Dynamic Obstacle Avoidance for USVs Using Cross-Domain Deep Reinforcement Learning and Neural Network Model Predictive Controller Sensors unmanned surface vehicle deep reinforcement learning collision avoidance model predictive control |
title | Dynamic Obstacle Avoidance for USVs Using Cross-Domain Deep Reinforcement Learning and Neural Network Model Predictive Controller |
title_full | Dynamic Obstacle Avoidance for USVs Using Cross-Domain Deep Reinforcement Learning and Neural Network Model Predictive Controller |
title_fullStr | Dynamic Obstacle Avoidance for USVs Using Cross-Domain Deep Reinforcement Learning and Neural Network Model Predictive Controller |
title_full_unstemmed | Dynamic Obstacle Avoidance for USVs Using Cross-Domain Deep Reinforcement Learning and Neural Network Model Predictive Controller |
title_short | Dynamic Obstacle Avoidance for USVs Using Cross-Domain Deep Reinforcement Learning and Neural Network Model Predictive Controller |
title_sort | dynamic obstacle avoidance for usvs using cross domain deep reinforcement learning and neural network model predictive controller |
topic | unmanned surface vehicle deep reinforcement learning collision avoidance model predictive control |
url | https://www.mdpi.com/1424-8220/23/7/3572 |
work_keys_str_mv | AT jianwenli dynamicobstacleavoidanceforusvsusingcrossdomaindeepreinforcementlearningandneuralnetworkmodelpredictivecontroller AT jalilchavezgalaviz dynamicobstacleavoidanceforusvsusingcrossdomaindeepreinforcementlearningandneuralnetworkmodelpredictivecontroller AT kamyarazizzadenesheli dynamicobstacleavoidanceforusvsusingcrossdomaindeepreinforcementlearningandneuralnetworkmodelpredictivecontroller AT ninamahmoudian dynamicobstacleavoidanceforusvsusingcrossdomaindeepreinforcementlearningandneuralnetworkmodelpredictivecontroller |