Fault-Tolerant Control of Skid Steering Vehicles Based on Meta-Reinforcement Learning with Situation Embedding
Meta-reinforcement learning (meta-RL), used in the fault-tolerant control (FTC) problem, learns a meta-trained model from a set of fault situations that have a high-level similarity. However, in the real world, skid-steering vehicles might experience different types of fault situations. The use of a...
Main Authors: | Huatong Dai, Pengzhan Chen, Hui Yang |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-02-01 |
Series: | Actuators |
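Subjects: | fault-tolerant control; skid-steering vehicle; reinforcement learning (RL); meta-learning; situation embedding |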
Online Access: | https://www.mdpi.com/2076-0825/11/3/72 |
_version_ | 1797473415447183360 |
---|---|
author | Huatong Dai; Pengzhan Chen; Hui Yang |
author_facet | Huatong Dai; Pengzhan Chen; Hui Yang |
author_sort | Huatong Dai |
collection | DOAJ |
description | Meta-reinforcement learning (meta-RL), used in the fault-tolerant control (FTC) problem, learns a meta-trained model from a set of fault situations that have a high-level similarity. However, in the real world, skid-steering vehicles might experience different types of fault situations. The use of a single initial meta-trained model limits the ability to learn different types of fault situations that do not possess a strong similarity. In this paper, we propose a novel FTC method to mitigate this limitation, by meta-training multiple initial meta-trained models and selecting the most suitable model to adapt to the fault situation. The proposed FTC method is based on the meta deep deterministic policy gradient (meta-DDPG) algorithm, which includes an offline stage and an online stage. In the offline stage, we first train multiple meta-trained models corresponding to different types of fault situations, and then a situation embedding model is trained with the state-transition data generated from meta-trained models. In the online stage, the most suitable meta-trained model is selected to adapt to the current fault situation. The simulation results demonstrate that the proposed FTC method allows skid-steering vehicles to adapt to different types of fault situations stably, while requiring significantly fewer fine-tuning steps than the baseline. |
first_indexed | 2024-03-09T20:14:11Z |
format | Article |
id | doaj.art-0a245ae9bbdf4762a66ee7c8619a42be |
institution | Directory Open Access Journal |
issn | 2076-0825 |
language | English |
last_indexed | 2024-03-09T20:14:11Z |
publishDate | 2022-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Actuators |
spelling | doaj.art-0a245ae9bbdf4762a66ee7c8619a42be; 2023-11-24T00:04:36Z; eng; MDPI AG; Actuators; 2076-0825; 2022-02-01; vol. 11, no. 3, art. 72; 10.3390/act11030072; Fault-Tolerant Control of Skid Steering Vehicles Based on Meta-Reinforcement Learning with Situation Embedding; Huatong Dai, Pengzhan Chen, Hui Yang (all: School of Electrical Engineering and Automation, East China Jiaotong University, Nanchang 330013, China); https://www.mdpi.com/2076-0825/11/3/72; fault-tolerant control; skid-steering vehicle; reinforcement learning (RL); meta-learning; situation embedding |
spellingShingle | Huatong Dai Pengzhan Chen Hui Yang Fault-Tolerant Control of Skid Steering Vehicles Based on Meta-Reinforcement Learning with Situation Embedding Actuators fault-tolerant control skid-steering vehicle reinforcement learning (RL) meta-learning situation embedding |
title | Fault-Tolerant Control of Skid Steering Vehicles Based on Meta-Reinforcement Learning with Situation Embedding |
title_full | Fault-Tolerant Control of Skid Steering Vehicles Based on Meta-Reinforcement Learning with Situation Embedding |
title_fullStr | Fault-Tolerant Control of Skid Steering Vehicles Based on Meta-Reinforcement Learning with Situation Embedding |
title_full_unstemmed | Fault-Tolerant Control of Skid Steering Vehicles Based on Meta-Reinforcement Learning with Situation Embedding |
title_short | Fault-Tolerant Control of Skid Steering Vehicles Based on Meta-Reinforcement Learning with Situation Embedding |
title_sort | fault tolerant control of skid steering vehicles based on meta reinforcement learning with situation embedding |
topic | fault-tolerant control; skid-steering vehicle; reinforcement learning (RL); meta-learning; situation embedding |
url | https://www.mdpi.com/2076-0825/11/3/72 |
work_keys_str_mv | AT huatongdai faulttolerantcontrolofskidsteeringvehiclesbasedonmetareinforcementlearningwithsituationembedding AT pengzhanchen faulttolerantcontrolofskidsteeringvehiclesbasedonmetareinforcementlearningwithsituationembedding AT huiyang faulttolerantcontrolofskidsteeringvehiclesbasedonmetareinforcementlearningwithsituationembedding |
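
The online stage described in the abstract, in which the most suitable meta-trained model is selected for the current fault situation, can be illustrated with a small sketch. This is not the authors' implementation: the record does not say how the situation embedding model is built or how suitability is measured. The sketch assumes that each meta-trained model is represented by a prototype embedding vector, that recent state-transition data is embedded by simple averaging (a stand-in for a trained embedding network), and that the nearest prototype by Euclidean distance is chosen. The names `embed_transitions`, `select_model`, and `prototypes`, as well as all numeric values, are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def embed_transitions(transitions: Sequence[Vector]) -> List[float]:
    """Stand-in for a trained situation embedding model (assumption):
    averages the transition feature vectors component-wise."""
    n = len(transitions)
    dim = len(transitions[0])
    return [sum(t[i] for t in transitions) / n for i in range(dim)]


def select_model(transitions: Sequence[Vector],
                 prototypes: Sequence[Vector]) -> int:
    """Return the index of the prototype closest (Euclidean distance)
    to the embedding of the observed transitions."""
    z = embed_transitions(transitions)
    distances = [math.dist(z, p) for p in prototypes]
    return distances.index(min(distances))


# Hypothetical prototype embeddings, one per meta-trained model
# (for example, one per type of fault situation).
prototypes = [
    [0.0, 0.0],
    [1.0, 1.0],
    [-1.0, 2.0],
]

# Hypothetical features extracted from recent state transitions.
observed = [
    [0.9, 1.1],
    [1.2, 0.8],
    [1.0, 1.0],
]

selected = select_model(observed, prototypes)
print(f"Selected meta-trained model index: {selected}")
```

In the method summarized above, the selected model would then be fine-tuned on the current fault situation; that adaptation step is outside the scope of this sketch.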