Reinforcement Learning for Autonomous Underwater Vehicles via Data-Informed Domain Randomization

Autonomous Underwater Vehicles (AUVs) or underwater vehicle-manipulator systems often have large model uncertainties from degenerated or damaged thrusters, varying payloads, disturbances from currents, etc. Other constraints, such as input dead zones and saturations, make the feedback controllers di...

Bibliographic Details
Main Authors:	Wenjie Lu, Kai Cheng, Manman Hu
Format:	Article
Language:	English
Published:	MDPI AG 2023-01-01
Series:	Applied Sciences
Subjects:	autonomous underwater vehicles; uncertainty attenuation; reinforcement learning; domain randomization
Online Access:	https://www.mdpi.com/2076-3417/13/3/1723

_version_	1797625096785887232
author	Wenjie Lu Kai Cheng Manman Hu
author_facet	Wenjie Lu Kai Cheng Manman Hu
author_sort	Wenjie Lu
collection	DOAJ
description	Autonomous Underwater Vehicles (AUVs) or underwater vehicle-manipulator systems often have large model uncertainties from degenerated or damaged thrusters, varying payloads, disturbances from currents, etc. Other constraints, such as input dead zones and saturations, make the feedback controllers difficult to tune online. Model-free Reinforcement Learning (RL) has been applied to control AUVs, but most results were validated through numerical simulations. The trained controllers often perform unsatisfactorily on real AUVs; this is because the distributions of the AUV dynamics in numerical simulations and those of real AUVs are mismatched. This paper presents a model-free RL via Data-informed Domain Randomization (DDR) for controlling AUVs, where the mismatches between the trajectory data from numerical simulations and the real AUV were minimized by adjusting the parameters in the simulated AUVs. The DDR strategy extends the existing adaptive domain randomization technique by aggregating an input network to learn mappings between control signals across domains, enabling the controller to adapt to sudden changes in dynamics. The proposed RL via DDR was tested on the problems of AUV pose regulation through extensive numerical simulations and experiments in a lab tank with an underwater positioning system. These results have demonstrated the effectiveness of RL-DDR for transferring trained controllers to AUVs with different dynamics.
first_indexed	2024-03-11T09:52:00Z
format	Article
id	doaj.art-5cfdd220ab5f4239b1af5c1befbac8e0
institution	Directory Open Access Journal
issn	2076-3417
language	English
last_indexed	2024-03-11T09:52:00Z
publishDate	2023-01-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj.art-5cfdd220ab5f4239b1af5c1befbac8e0 | 2023-11-16T16:09:29Z | eng | MDPI AG | Applied Sciences | 2076-3417 | 2023-01-01 | 13 | 3 | 1723 | 10.3390/app13031723 | Reinforcement Learning for Autonomous Underwater Vehicles via Data-Informed Domain Randomization | Wenjie Lu [0]; Kai Cheng [1]; Manman Hu [2] | [0] School of Mechanical Engineering and Automation, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China; [1] School of Mechanical Engineering and Automation, Harbin Institute of Technology (Shenzhen), Shenzhen 518055, China; [2] Department of Civil Engineering, University of Hong Kong, Hong Kong, China | Autonomous Underwater Vehicles (AUVs) or underwater vehicle-manipulator systems often have large model uncertainties from degenerated or damaged thrusters, varying payloads, disturbances from currents, etc. Other constraints, such as input dead zones and saturations, make the feedback controllers difficult to tune online. Model-free Reinforcement Learning (RL) has been applied to control AUVs, but most results were validated through numerical simulations. The trained controllers often perform unsatisfactorily on real AUVs; this is because the distributions of the AUV dynamics in numerical simulations and those of real AUVs are mismatched. This paper presents a model-free RL via Data-informed Domain Randomization (DDR) for controlling AUVs, where the mismatches between the trajectory data from numerical simulations and the real AUV were minimized by adjusting the parameters in the simulated AUVs. The DDR strategy extends the existing adaptive domain randomization technique by aggregating an input network to learn mappings between control signals across domains, enabling the controller to adapt to sudden changes in dynamics. The proposed RL via DDR was tested on the problems of AUV pose regulation through extensive numerical simulations and experiments in a lab tank with an underwater positioning system. These results have demonstrated the effectiveness of RL-DDR for transferring trained controllers to AUVs with different dynamics. | https://www.mdpi.com/2076-3417/13/3/1723 | autonomous underwater vehicles; uncertainty attenuation; reinforcement learning; domain randomization
spellingShingle	Wenjie Lu Kai Cheng Manman Hu Reinforcement Learning for Autonomous Underwater Vehicles via Data-Informed Domain Randomization Applied Sciences autonomous underwater vehicles uncertainty attenuation reinforcement learning domain randomization
title	Reinforcement Learning for Autonomous Underwater Vehicles via Data-Informed Domain Randomization
title_full	Reinforcement Learning for Autonomous Underwater Vehicles via Data-Informed Domain Randomization
title_fullStr	Reinforcement Learning for Autonomous Underwater Vehicles via Data-Informed Domain Randomization
title_full_unstemmed	Reinforcement Learning for Autonomous Underwater Vehicles via Data-Informed Domain Randomization
title_short	Reinforcement Learning for Autonomous Underwater Vehicles via Data-Informed Domain Randomization
title_sort	reinforcement learning for autonomous underwater vehicles via data informed domain randomization
topic	autonomous underwater vehicles uncertainty attenuation reinforcement learning domain randomization
url	https://www.mdpi.com/2076-3417/13/3/1723
work_keys_str_mv	AT wenjielu reinforcementlearningforautonomousunderwatervehiclesviadatainformeddomainrandomization AT kaicheng reinforcementlearningforautonomousunderwatervehiclesviadatainformeddomainrandomization AT manmanhu reinforcementlearningforautonomousunderwatervehiclesviadatainformeddomainrandomization

Similar Items