Reference Model-Based Deterministic Policy for Pitch and Depth Control of Autonomous Underwater Vehicle

Deep Reinforcement Learning (DRL) is an optimal control method with generalization capacity for complex, nonlinear, coupled systems. However, in pursuit of the fastest response, a DRL agent tends to saturate its control commands and overshoot the target. In this study, a reference model-based...

Bibliographic Details
Main Authors:	Jiqing Du, Dan Zhou, Wei Wang, Sachiyo Arai
Format:	Article
Language:	English
Published:	MDPI AG 2023-03-01
Series:	Journal of Marine Science and Engineering
Subjects:	autonomous underwater vehicle; reference model; Model-Reference Twin Delayed Deep Deterministic (MR-TD3); pitch and depth control
Online Access:	https://www.mdpi.com/2077-1312/11/3/588

_version_	1827749220995563520
author	Jiqing Du, Dan Zhou, Wei Wang, Sachiyo Arai
author_facet	Jiqing Du, Dan Zhou, Wei Wang, Sachiyo Arai
author_sort	Jiqing Du
collection	DOAJ
description	Deep Reinforcement Learning (DRL) is an optimal control method with generalization capacity for complex, nonlinear, coupled systems. However, in pursuit of the fastest response, a DRL agent tends to saturate its control commands and overshoot the target. In this study, a reference model-based DRL control strategy termed Model-Reference Twin Delayed Deep Deterministic (MR-TD3) was proposed for controlling the pitch attitude and depth of an autonomous underwater vehicle (AUV) system. First, a reference model based on an actual AUV system was introduced into an actor–critic structure: the input of the model was the reference target, its outputs were the smoothed reference targets, and its parameters set the response time and the smoothness. The input commands were limited to the saturation range. Then, the model state, the real state, and the reference target were mapped to the control command by the Twin Delayed Deep Deterministic (TD3) agent during training. Finally, the trained neural network was applied to the AUV system environment for pitch and depth experiments. The results demonstrated that the controller can eliminate response overshoot and control command saturation while improving robustness, and that the method can also be extended to other control platforms such as autonomous guided vehicles or unmanned aerial vehicles.
first_indexed	2024-03-11T06:19:46Z
format	Article
id	doaj.art-07557130f39e4944938c315f6bb847f1
institution	Directory of Open Access Journals
issn	2077-1312
language	English
last_indexed	2024-03-11T06:19:46Z
publishDate	2023-03-01
publisher	MDPI AG
record_format	Article
series	Journal of Marine Science and Engineering
spelling	doaj.art-07557130f39e4944938c315f6bb847f1; 2023-11-17T11:57:42Z; eng; MDPI AG; Journal of Marine Science and Engineering; 2077-1312; 2023-03-01; vol. 11, no. 3, art. 588; doi:10.3390/jmse11030588; Reference Model-Based Deterministic Policy for Pitch and Depth Control of Autonomous Underwater Vehicle; Jiqing Du (Graduate School of Science and Engineering, Chiba University, Chiba 263-8522, Japan); Dan Zhou (Graduate School of Science and Engineering, Chiba University, Chiba 263-8522, Japan); Wei Wang (Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET), Nanjing University of Information Science & Technology, Nanjing 210044, China); Sachiyo Arai (Graduate School of Science and Engineering, Chiba University, Chiba 263-8522, Japan)
title	Reference Model-Based Deterministic Policy for Pitch and Depth Control of Autonomous Underwater Vehicle
topic	autonomous underwater vehicle; reference model; Model-Reference Twin Delayed Deep Deterministic (MR-TD3); pitch and depth control
url	https://www.mdpi.com/2077-1312/11/3/588
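
The reference-model stage summarized in the description field can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record states only that the reference model is based on an actual AUV system, so the generic second-order filter below, the function names (reference_trajectory, saturate, build_observation), and all parameter values (omega, zeta, dt, steps) are assumptions chosen for illustration.

```python
def reference_trajectory(target, omega=1.0, zeta=1.0, dt=0.01, steps=1000):
    """Smooth a step target with a second-order filter (assumed model form).

    omega (natural frequency) sets the response time and zeta (damping
    ratio) sets the smoothness. Both are hypothetical tuning parameters.
    """
    y, v = 0.0, 0.0  # smoothed reference and its rate of change
    trajectory = []
    for _ in range(steps):
        # y'' = omega^2 * (target - y) - 2 * zeta * omega * y'
        accel = omega ** 2 * (target - y) - 2.0 * zeta * omega * v
        v += accel * dt  # semi-implicit Euler: update the rate first
        y += v * dt      # then the position, using the new rate
        trajectory.append(y)
    return trajectory


def saturate(command, limit):
    """Clamp a control command to the range [-limit, +limit]."""
    return max(-limit, min(limit, command))


def build_observation(model_state, real_state, target):
    """Collect the three quantities the description says are mapped to
    the control command (the agent itself is not sketched here)."""
    return [model_state, real_state, target]


if __name__ == "__main__":
    smoothed = reference_trajectory(target=1.0)
    print(smoothed[-1], max(smoothed))
    print(build_observation(smoothed[100], 0.0, 1.0))
```

With zeta = 1.0 the filter is critically damped, so the smoothed reference approaches the target without passing it; a larger omega shortens the response time. A TD3 agent trained to track this smoothed reference rather than the raw step is the idea the description attributes to MR-TD3.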

Similar Items