Learning and deploying robust locomotion policies with minimal dynamics randomization

Training Deep Reinforcement Learning (DRL) locomotion policies often requires massive amounts of data to converge to the desired behavior. In this regard, simulators provide a cheap and abundant source. For successful sim-to-real transfer, exhaustively engineered approaches such as system identificati...

Bibliographic Details
Main Authors:	Campanaro, L, Gangapurwala, S, Merkt, W, Havoutis, I
Format:	Conference item
Language:	English
Published:	Proceedings of Machine Learning Research 2024

_version_	1826315119683960832
author	Campanaro, L Gangapurwala, S Merkt, W Havoutis, I
collection	OXFORD
description	Training Deep Reinforcement Learning (DRL) locomotion policies often requires massive amounts of data to converge to the desired behavior. In this regard, simulators provide a cheap and abundant source. For successful sim-to-real transfer, exhaustively engineered approaches such as system identification, dynamics randomization, and domain adaptation are generally employed. As an alternative, we investigate a simple strategy of random force injection (RFI) to perturb system dynamics during training. We show that the application of random forces enables us to emulate dynamics randomization. This allows us to obtain locomotion policies that are robust to variations in system dynamics. We further extend RFI by introducing an episodic actuation offset, and refer to the result as extended random force injection (ERFI). We demonstrate that ERFI provides additional robustness to variations in system mass, offering on average a 53% improvement in performance over RFI. We also show that ERFI is sufficient to perform a successful sim-to-real transfer on two different quadrupedal platforms, ANYmal C and Unitree A1, even for perceptive locomotion over uneven terrain in outdoor environments.
first_indexed	2024-12-09T03:20:09Z
format	Conference item
id	oxford-uuid:507b9bd0-19ad-4ba4-b2dd-4afa255cfc86
institution	University of Oxford
language	English
last_indexed	2024-12-09T03:20:09Z
publishDate	2024
publisher	Proceedings of Machine Learning Research
record_format	dspace
spelling	oxford-uuid:507b9bd0-19ad-4ba4-b2dd-4afa255cfc86 | 2024-11-05T11:50:17Z | Learning and deploying robust locomotion policies with minimal dynamics randomization | Conference item | http://purl.org/coar/resource_type/c_5794 | uuid:507b9bd0-19ad-4ba4-b2dd-4afa255cfc86 | English | Symplectic Elements | Proceedings of Machine Learning Research | 2024 | Campanaro, L | Gangapurwala, S | Merkt, W | Havoutis, I | Training Deep Reinforcement Learning (DRL) locomotion policies often requires massive amounts of data to converge to the desired behavior. In this regard, simulators provide a cheap and abundant source. For successful sim-to-real transfer, exhaustively engineered approaches such as system identification, dynamics randomization, and domain adaptation are generally employed. As an alternative, we investigate a simple strategy of random force injection (RFI) to perturb system dynamics during training. We show that the application of random forces enables us to emulate dynamics randomization. This allows us to obtain locomotion policies that are robust to variations in system dynamics. We further extend RFI by introducing an episodic actuation offset, and refer to the result as extended random force injection (ERFI). We demonstrate that ERFI provides additional robustness to variations in system mass, offering on average a 53% improvement in performance over RFI. We also show that ERFI is sufficient to perform a successful sim-to-real transfer on two different quadrupedal platforms, ANYmal C and Unitree A1, even for perceptive locomotion over uneven terrain in outdoor environments.
title	Learning and deploying robust locomotion policies with minimal dynamics randomization
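
The description above outlines two perturbations applied during training: random forces injected to perturb the dynamics (RFI), and an additional actuation offset that is fixed for the duration of an episode (ERFI). The following is a minimal sketch of that idea, not the authors' implementation. The class name `ERFIPerturbation`, the choice of uniform sampling, the application at the joint-torque level, and the limits `rfi_limit` and `offset_limit` are all assumptions made for illustration; the record does not specify them.

```python
import numpy as np


class ERFIPerturbation:
    """Hypothetical helper that perturbs commanded joint torques.

    Assumptions (not stated in the record): perturbations are drawn from
    uniform distributions and are added directly to the joint torques.
    """

    def __init__(self, num_joints, rfi_limit, offset_limit, seed=0):
        self.num_joints = num_joints
        self.rfi_limit = rfi_limit        # bound on the per-step random torque
        self.offset_limit = offset_limit  # bound on the per-episode offset
        self.rng = np.random.default_rng(seed)
        self.offset = np.zeros(num_joints)

    def reset(self):
        """Sample the episodic actuation offset once, at the start of an episode."""
        self.offset = self.rng.uniform(
            -self.offset_limit, self.offset_limit, self.num_joints
        )

    def apply(self, policy_torque):
        """Add a freshly sampled random torque plus the fixed episodic offset."""
        noise = self.rng.uniform(-self.rfi_limit, self.rfi_limit, self.num_joints)
        return np.asarray(policy_torque) + noise + self.offset


# Usage: one episode of a 12-joint quadruped with placeholder torques.
perturb = ERFIPerturbation(num_joints=12, rfi_limit=2.0, offset_limit=1.0)
perturb.reset()
for _ in range(3):
    commanded = np.zeros(12)             # stand-in for the policy output
    applied = perturb.apply(commanded)   # torque that would be sent to the simulator
```

Setting `offset_limit` to zero reduces this sketch to per-step random force injection alone, which is one way to compare the two variants under otherwise identical settings.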

Similar Items