Efficient Learning Control via Structural Policy Priors, Latent World Models and Hierarchical Abstraction
Learning control will enable the deployment of autonomous robots in unstructured real-world settings. Solving the associated complex decision processes under real-time constraints will require intuition, guiding current actions by prior experience to anticipate long-horizon environment interactions...
Main Author: | Seyde, Tim N. |
---|---|
Other Authors: | Rus, Daniela |
Format: | Thesis |
Published: | Massachusetts Institute of Technology, 2024 |
Online Access: | https://hdl.handle.net/1721.1/156611 |
author | Seyde, Tim N. |
author2 | Rus, Daniela |
collection | MIT |
description | Learning control will enable the deployment of autonomous robots in unstructured real-world settings. Solving the associated complex decision processes under real-time constraints will require intuition, guiding current actions by prior experience to anticipate long-horizon environment interactions and integrating with optimal control to ground action selection in short-horizon system constraints. Ensuring tractability of the underlying learning process is conditional upon maximizing task-aligned information extracted from environment interactions while minimizing the required guidance via human interventions. In this thesis, we develop novel learning control algorithms that enable efficient acquisition of complex behaviors while limiting prior knowledge, direct human supervision, and computational requirements. Our study focuses on learning from interaction through reinforcement learning, combining insights from model-free, model-based, and hierarchical techniques. We design decoupled discrete policy structures to yield memory-efficient agent representations. Our study demonstrates the competitive performance of critic-only agents on continuous control tasks, highlighting accelerated information propagation and exploration benefits. We further leverage hierarchical abstraction over diverse behavior components to enable time-efficient optimization. Our methods jointly learn heterogeneous low-level controller parameterizations via mixture policies for single-agent control while decoupling multi-timescale strategic reasoning from reactive reasoning in the context of multi-agent team coordination. We lastly build latent world models for multi-step reasoning and sample-efficient interaction selection. Our work employs uncertainty over expected long-term returns for targeted deep exploration and constructs multi-agent interaction models to accelerate competitive behavior learning via self-play in imagination.
In sum, this thesis develops scalable and efficient robot learning algorithms by addressing representational challenges across layers of abstraction, providing agents with an intrinsic ability to set implicit exploration goals under high-level guidance, and facilitating information propagation in limited data regimes. |
format | Thesis |
id | mit-1721.1/156611 |
institution | Massachusetts Institute of Technology |
publishDate | 2024 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/156611 Efficient Learning Control via Structural Policy Priors, Latent World Models and Hierarchical Abstraction Seyde, Tim N. Rus, Daniela Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Ph.D. 2024-05 Thesis https://hdl.handle.net/1721.1/156611 Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) Copyright retained by author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ application/pdf Massachusetts Institute of Technology |
title | Efficient Learning Control via Structural Policy Priors, Latent World Models and Hierarchical Abstraction |
url | https://hdl.handle.net/1721.1/156611 |
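The abstract above mentions decoupled discrete policy structures that make critic-only agents memory-efficient on continuous control tasks. As a rough illustration of the general idea (an illustrative sketch under simplifying assumptions, not the thesis implementation; all class and parameter names here are hypothetical), one can give each action dimension its own small discrete Q-head over a few action bins, so the representation grows linearly rather than exponentially with the number of action dimensions:

```python
import numpy as np

# Illustrative sketch (hypothetical names, not the thesis code): a
# "decoupled" discrete critic assigns each action dimension its own small
# linear Q-head over a few bins (e.g. bang-bang: {-1, +1}). Memory is
# O(n_dims * n_bins) instead of O(n_bins ** n_dims) for a joint table.

class DecoupledDiscreteCritic:
    def __init__(self, n_dims, n_bins, n_features):
        # One linear Q-head per action dimension: (n_dims, n_bins, n_features).
        self.w = np.zeros((n_dims, n_bins, n_features))

    def q_per_dim(self, features):
        # Q-values for every (dimension, bin) pair given state features.
        return self.w @ features                        # shape (n_dims, n_bins)

    def greedy_action(self, features):
        # Each dimension selects its best bin independently of the others.
        return self.q_per_dim(features).argmax(axis=1)  # shape (n_dims,)

    def joint_value(self, features, action):
        # Joint state-action value as the mean of the chosen per-dim Q-values.
        q = self.q_per_dim(features)
        return q[np.arange(len(action)), action].mean()

critic = DecoupledDiscreteCritic(n_dims=6, n_bins=2, n_features=8)
a = critic.greedy_action(np.ones(8))
print(a.shape)  # → (6,)
```

With six action dimensions and two bins each, the decoupled critic stores 6 × 2 heads rather than 2⁶ joint entries; the per-dimension argmax is what enables the fast, exploration-friendly action selection the abstract alludes to.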