Docking Control of an Autonomous Underwater Vehicle Using Reinforcement Learning

To achieve persistent systems in the future, autonomous underwater vehicles (AUVs) will need to autonomously dock onto a charging station. Here, reinforcement learning strategies were applied for the first time to control the docking of an AUV onto a fixed platform in a simulation environment. Two r...


Bibliographic Details
Main Authors:	Enrico Anderlini, Gordon G. Parker, Giles Thomas
Format:	Article
Language:	English
Published:	MDPI AG 2019-08-01
Series:	Applied Sciences
Subjects:	autonomous underwater vehicle, reinforcement learning, optimal control
Online Access:	https://www.mdpi.com/2076-3417/9/17/3456

author	Enrico Anderlini, Gordon G. Parker, Giles Thomas
author_facet	Enrico Anderlini, Gordon G. Parker, Giles Thomas
author_sort	Enrico Anderlini
collection	DOAJ
description	To achieve persistent systems in the future, autonomous underwater vehicles (AUVs) will need to autonomously dock onto a charging station. Here, reinforcement learning strategies were applied for the first time to control the docking of an AUV onto a fixed platform in a simulation environment. Two reinforcement learning schemes were investigated: one with continuous state and action spaces, deep deterministic policy gradient (DDPG), and one with continuous state but discrete action spaces, deep Q network (DQN). For DQN, the discrete actions were selected as step changes in the control input signals. The performance of the reinforcement learning strategies was compared with classical and optimal control techniques. The control actions selected by DDPG suffer from chattering effects due to a hyperbolic tangent layer in the actor. Conversely, DQN presents the best compromise between short docking time and low control effort, whilst meeting the docking requirements. Whereas the reinforcement learning algorithms present a very high computational cost at training time, they are five orders of magnitude faster than optimal control at deployment time, thus enabling an on-line implementation. Therefore, reinforcement learning achieves a performance similar to optimal control at a much lower computational cost at deployment, whilst also presenting a more general framework.
first_indexed	2024-04-14T02:33:27Z
format	Article
id	doaj.art-8872a4bd80b94375bb2267851ee18557
institution	Directory of Open Access Journals
issn	2076-3417
language	English
last_indexed	2024-04-14T02:33:27Z
publishDate	2019-08-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj.art-8872a4bd80b94375bb2267851ee18557; 2022-12-22T02:17:36Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2019-08-01; vol. 9, no. 17, art. 3456; doi:10.3390/app9173456; app9173456; Docking Control of an Autonomous Underwater Vehicle Using Reinforcement Learning; Enrico Anderlini (Department of Mechanical Engineering, University College London, London WC1E 7JE, UK); Gordon G. Parker (Department of Mechanical Engineering-Engineering Mechanics, Michigan Technological University, Houghton, MI 49931, USA); Giles Thomas (Department of Mechanical Engineering, University College London, London WC1E 7JE, UK); https://www.mdpi.com/2076-3417/9/17/3456; autonomous underwater vehicle, reinforcement learning, optimal control
title	Docking Control of an Autonomous Underwater Vehicle Using Reinforcement Learning
topic	autonomous underwater vehicle, reinforcement learning, optimal control
url	https://www.mdpi.com/2076-3417/9/17/3456
work_keys_str_mv	AT enricoanderlini dockingcontrolofanautonomousunderwatervehicleusingreinforcementlearning AT gordongparker dockingcontrolofanautonomousunderwatervehicleusingreinforcementlearning AT gilesthomas dockingcontrolofanautonomousunderwatervehicleusingreinforcementlearning
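
The description field states that, for DQN, the discrete actions were selected as step changes in the control input signals. A minimal Python sketch of that idea follows. It is an illustration only, not the authors' implementation: the number of control inputs, the step sizes, the saturation limits, and the names `STEP`, `U_MIN`, `U_MAX`, `ACTIONS`, `apply_action`, and `greedy_action` are all assumptions, since the record gives none of these details.

```python
from itertools import product

import numpy as np

# Hypothetical values: the record does not give the number of control
# inputs, their step sizes, or their saturation limits.
STEP = np.array([0.1, 0.05])    # assumed step change per control input
U_MIN = np.array([-1.0, -0.5])  # assumed lower limits
U_MAX = np.array([1.0, 0.5])    # assumed upper limits

# Each discrete action decreases, holds, or increases every input by one
# step, giving 3**2 = 9 actions for two inputs.
ACTIONS = np.array(list(product((-1, 0, 1), repeat=len(STEP))))


def apply_action(u, action_index):
    """Return the control input after one discrete step change, clipped to limits."""
    u_next = u + ACTIONS[action_index] * STEP
    return np.clip(u_next, U_MIN, U_MAX)


def greedy_action(q_values):
    """Pick the index of the action with the largest estimated Q-value."""
    return int(np.argmax(q_values))
```

In this sketch a Q-network would output one value per row of `ACTIONS`; `greedy_action` selects a row and `apply_action` turns it into the next control input. Because each action is a bounded increment, the control signal can only change by one step per decision, which is one plausible reading of why a step-change action set avoids abrupt switching.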

Similar Items