Learning system adaptive legged robotic locomotion policies


Bibliographic information
Main author:	Gangapurwala, S
Other authors:	Havoutis, I
Format:	Thesis
Language:	English
Published:	2022
Subjects:	Robotics, Reinforcement learning

_version_	1826311962790723584
author	Gangapurwala, S
author2	Havoutis, I
author_facet	Havoutis, I Gangapurwala, S
author_sort	Gangapurwala, S
collection	OXFORD
description	<p>The ability to form support contacts at discontinuous locations makes legged robots suitable for locomotion over highly unstructured terrains. While recent years have witnessed significant robotic developments, delivering extremely dynamic and robust hardware solutions, the control intelligence for legged robots to perform agile and sophisticated maneuvers remains an active area of research. This thesis, therefore, focuses on the control of legged systems, particularly quadrupedal robots.</p> <p>The research presented in this thesis is driven by the motivation that <em>a controller governing the behavior of a system should thoroughly utilize its potential while also adapting to variations in system dynamics through emergence of behavior that still achieves the control objective</em>.</p> <p>Sampling-based search methods allow exploration across vast regions of operation, thereby enabling the discovery of near-optimal solutions. Similarly, data-driven reinforcement learning (RL) strategies allow the development of controllers that exploit system dynamics to achieve the control objective described by a high-level reward function. The problem of legged robot locomotion can thus be approached using reinforcement learning to obtain robust and dynamic control solutions. Additionally, the control policy describing the behavior of the system can be parameterized as a deep neural network to perform a complex non-linear mapping of the robot state information to the desired control action. This approach of utilizing a deep neural network is referred to as deep reinforcement learning. In this dissertation I focus on employing deep RL strategies for quadrupedal locomotion.</p> <p>I show that encouraging the RL control policy to model system dynamics, even implicitly, allows for the emergence of adaptive control behaviors. I use this observation throughout this thesis to develop system adaptive control strategies, leading up to the ambitious goal of obtaining an RL locomotion policy capable of zero-shot transfer to quadrupeds of varying kinematic and dynamic properties.</p> <p>Although the main contributions relate to RL, I also recognize the benefits and drawbacks of different control approaches, and thus explore modular control architectures that utilize both model-based and data-driven model-free strategies for robot locomotion. In this regard, I present training and control architectures that exhibit dynamic locomotion behavior over terrains with varying steps and inclines, both in lab experiments and field trials.</p> <p>In combination, the works presented in this thesis investigate and, in my firm belief, further the state of legged robotic control, advocating for the development of artificial control intelligence that matches the level of complexity demonstrated by massively evolved biological counterparts.</p>
first_indexed	2024-03-07T08:19:01Z
format	Thesis
id	oxford-uuid:a18eeb06-ed54-47c0-a0f0-be992a609690
institution	University of Oxford
language	English
last_indexed	2024-03-07T08:19:01Z
publishDate	2022
record_format	dspace
spelling	oxford-uuid:a18eeb06-ed54-47c0-a0f0-be992a609690 2024-01-17T09:13:14Z Learning system adaptive legged robotic locomotion policies Thesis http://purl.org/coar/resource_type/c_db06 uuid:a18eeb06-ed54-47c0-a0f0-be992a609690 Robotics Reinforcement learning English Hyrax Deposit 2022 Gangapurwala, S Havoutis, I Posner, I
title	Learning system adaptive legged robotic locomotion policies
topic	Robotics Reinforcement learning
