Adaptive Quadruped Balance Control for Dynamic Environments Using Maximum-Entropy Reinforcement Learning

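External disturbance poses the primary threat to robot balance in dynamic environments. This paper presents a learning-based control architecture for quadrupedal self-balancing that adapts to multiple unpredictable scenarios of continuous external disturbance. Unlike conventional methods, which construct analytical models that explicitly reason about the balancing process, this work uses reinforcement learning and an artificial neural network to avoid complicated mathematical modeling. The control policy is composed of a neural network and a Tanh Gaussian policy, which implicitly establishes a fuzzy mapping from proprioceptive signals to action commands. During training, a maximum-entropy method (the soft actor-critic algorithm) is employed to give the policy strong exploration and generalization ability. The trained policy is validated both in simulation and in real-world experiments with a customized quadruped robot. The results demonstrate that the policy can be transferred to the real world easily, without elaborate configuration. Moreover, although the policy is trained under only one specific vibration condition, it remains robust under conditions never encountered during training.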
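
The description of this record characterizes the control policy as a neural network feeding a Tanh Gaussian policy trained with soft actor-critic. As a rough illustration of what a Tanh Gaussian policy head computes, the sketch below samples a bounded action and its log-probability from a per-dimension mean and log standard deviation. It is not the authors' implementation: the function name `sample_tanh_gaussian`, the three-dimensional action, and the numeric values are assumptions made for the example, and the network that would produce `mean` and `log_std` from proprioceptive signals is omitted.

```python
import math
import random


def sample_tanh_gaussian(mean, log_std, rng=random):
    """Sample a tanh-squashed Gaussian action and its log-probability.

    mean, log_std: per-dimension outputs of a policy network (hypothetical here).
    Returns (actions in (-1, 1), log-probability of the squashed sample).
    """
    actions = []
    log_prob = 0.0
    for mu, ls in zip(mean, log_std):
        std = math.exp(ls)
        eps = rng.gauss(0.0, 1.0)
        u = mu + std * eps          # reparameterized Gaussian sample
        a = math.tanh(u)            # squash into the bounded range (-1, 1)
        # Log-density of u under N(mu, std^2), written in terms of eps.
        log_prob += -0.5 * eps * eps - ls - 0.5 * math.log(2.0 * math.pi)
        # Change-of-variables correction for the tanh squashing;
        # the small constant guards against log(0) when tanh saturates.
        log_prob -= math.log(1.0 - a * a + 1e-6)
        actions.append(a)
    return actions, log_prob


# Illustrative values only: a 3-dimensional action with assumed outputs.
rng = random.Random(0)
action, logp = sample_tanh_gaussian([0.0, 0.5, -0.5], [-1.0, -1.0, -1.0], rng)
```

In a maximum-entropy method such as soft actor-critic, the negative of this log-probability acts as the entropy bonus, which is why the tanh correction term matters: without it the entropy of the bounded action would be misestimated.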


Bibliographic Details
Main Authors:	Haoran Sun, Tingting Fu, Yuanhuai Ling, Chaoming He
Format:	Article
Language:	English
Published:	MDPI AG 2021-09-01
Series:	Sensors
Subjects:	quadruped robot; multi-contact balance control; reinforcement learning (RL); artificial neural networks (ANN); soft actor-critic (SAC)
Online Access:	https://www.mdpi.com/1424-8220/21/17/5907

_version_	1797520825894567936
author	Haoran Sun, Tingting Fu, Yuanhuai Ling, Chaoming He
author_facet	Haoran Sun, Tingting Fu, Yuanhuai Ling, Chaoming He
author_sort	Haoran Sun
collection	DOAJ
description	External disturbance poses the primary threat to robot balance in dynamic environments. This paper presents a learning-based control architecture for quadrupedal self-balancing that adapts to multiple unpredictable scenarios of continuous external disturbance. Unlike conventional methods, which construct analytical models that explicitly reason about the balancing process, this work uses reinforcement learning and an artificial neural network to avoid complicated mathematical modeling. The control policy is composed of a neural network and a Tanh Gaussian policy, which implicitly establishes a fuzzy mapping from proprioceptive signals to action commands. During training, a maximum-entropy method (the soft actor-critic algorithm) is employed to give the policy strong exploration and generalization ability. The trained policy is validated both in simulation and in real-world experiments with a customized quadruped robot. The results demonstrate that the policy can be transferred to the real world easily, without elaborate configuration. Moreover, although the policy is trained under only one specific vibration condition, it remains robust under conditions never encountered during training.
first_indexed	2024-03-10T08:04:02Z
format	Article
id	doaj.art-82ca97d605df404f93998b5c809bf278
institution	Directory of Open Access Journals
issn	1424-8220
language	English
last_indexed	2024-03-10T08:04:02Z
publishDate	2021-09-01
publisher	MDPI AG
record_format	Article
series	Sensors
spelling	doaj.art-82ca97d605df404f93998b5c809bf278; 2023-11-22T11:14:16Z; eng; MDPI AG; Sensors; 1424-8220; 2021-09-01; 21(17), 5907; 10.3390/s21175907; Adaptive Quadruped Balance Control for Dynamic Environments Using Maximum-Entropy Reinforcement Learning; Haoran Sun, Tingting Fu, Yuanhuai Ling, Chaoming He (all: School of Mechanical Engineering, Southwest Jiaotong University, Chengdu 610031, China); https://www.mdpi.com/1424-8220/21/17/5907; quadruped robot; multi-contact balance control; reinforcement learning (RL); artificial neural networks (ANN); soft actor-critic (SAC)
spellingShingle	Haoran Sun Tingting Fu Yuanhuai Ling Chaoming He Adaptive Quadruped Balance Control for Dynamic Environments Using Maximum-Entropy Reinforcement Learning Sensors quadruped robot multi-contact balance control reinforcement learning (RL) artificial neural networks (ANN) soft actor-critic (SAC)
title	Adaptive Quadruped Balance Control for Dynamic Environments Using Maximum-Entropy Reinforcement Learning
title_full	Adaptive Quadruped Balance Control for Dynamic Environments Using Maximum-Entropy Reinforcement Learning
title_fullStr	Adaptive Quadruped Balance Control for Dynamic Environments Using Maximum-Entropy Reinforcement Learning
title_full_unstemmed	Adaptive Quadruped Balance Control for Dynamic Environments Using Maximum-Entropy Reinforcement Learning
title_short	Adaptive Quadruped Balance Control for Dynamic Environments Using Maximum-Entropy Reinforcement Learning
title_sort	adaptive quadruped balance control for dynamic environments using maximum entropy reinforcement learning
topic	quadruped robot; multi-contact balance control; reinforcement learning (RL); artificial neural networks (ANN); soft actor-critic (SAC)
url	https://www.mdpi.com/1424-8220/21/17/5907
work_keys_str_mv	AT haoransun adaptivequadrupedbalancecontrolfordynamicenvironmentsusingmaximumentropyreinforcementlearning AT tingtingfu adaptivequadrupedbalancecontrolfordynamicenvironmentsusingmaximumentropyreinforcementlearning AT yuanhuailing adaptivequadrupedbalancecontrolfordynamicenvironmentsusingmaximumentropyreinforcementlearning AT chaominghe adaptivequadrupedbalancecontrolfordynamicenvironmentsusingmaximumentropyreinforcementlearning

Similar Items