Efficient Learning Control via Structural Policy Priors, Latent World Models and Hierarchical Abstraction

Bibliographic Details
Main Author: Seyde, Tim N.
Other Authors: Rus, Daniela
Format: Thesis
Published: Massachusetts Institute of Technology 2024
Online Access: https://hdl.handle.net/1721.1/156611
Description

Learning control will enable the deployment of autonomous robots in unstructured real-world settings. Solving the associated complex decision processes under real-time constraints will require intuition, guiding current actions by prior experience to anticipate long-horizon environment interactions, and integration with optimal control to ground action selection in short-horizon system constraints. Ensuring tractability of the underlying learning process hinges on maximizing the task-aligned information extracted from environment interactions while minimizing the guidance required through human intervention.

In this thesis, we develop novel learning control algorithms that enable efficient acquisition of complex behaviors while limiting prior knowledge, direct human supervision, and computational requirements. Our study focuses on learning from interaction through reinforcement learning, combining insights from model-free, model-based, and hierarchical techniques. We design decoupled discrete policy structures that yield memory-efficient agent representations, and demonstrate the competitive performance of critic-only agents on continuous control tasks, highlighting accelerated information propagation and exploration benefits. We further leverage hierarchical abstraction over diverse behavior components to enable time-efficient optimization: our methods jointly learn heterogeneous low-level controller parameterizations via mixture policies for single-agent control, while decoupling multi-timescale strategic reasoning from reactive reasoning in multi-agent team coordination. Lastly, we build latent world models for multi-step reasoning and sample-efficient interaction selection. Our work employs uncertainty over expected long-term returns for targeted deep exploration and constructs multi-agent interaction models that accelerate competitive behavior learning via self-play in imagination.
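The decoupled discrete policy structures mentioned above can be illustrated with a minimal sketch (an illustrative assumption, not the thesis implementation): each continuous action dimension is discretized into a small set of bins and chosen independently by a per-dimension argmax over critic utilities, so a D-dimensional action space with B bins per dimension needs D×B utility values instead of B**D for a joint discretization. The `decoupled_greedy_action` helper, the `BINS` set, and the utility array are hypothetical names introduced here for illustration.

```python
import numpy as np

# Sketch of a decoupled discrete policy: selecting each continuous action
# dimension independently from a small bin set avoids the exponential
# blowup of jointly discretizing the action space.

BINS = np.array([-1.0, 0.0, 1.0])  # e.g. bang-bang extremes plus "off"

def decoupled_greedy_action(utilities: np.ndarray) -> np.ndarray:
    """utilities: shape (D, B) -- one row of bin utilities per action
    dimension, as a hypothetical decomposed critic might produce."""
    return BINS[np.argmax(utilities, axis=1)]  # per-dimension argmax

# A 4-dimensional example: the greedy action is composed independently
# per dimension from only 4 * 3 = 12 utility values.
utilities = np.array([
    [0.2, 0.5, 0.1],
    [0.9, 0.0, 0.3],
    [0.1, 0.2, 0.7],
    [0.4, 0.4, 0.6],
])
action = decoupled_greedy_action(utilities)  # -> [0., -1., 1., 1.]
```

The per-dimension factorization is what keeps the agent representation memory-efficient as the number of actuated dimensions grows.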
In sum, this thesis develops scalable and efficient robot learning algorithms by addressing representational challenges across layers of abstraction, providing agents with an intrinsic ability to set implicit exploration goals under high-level guidance, and facilitating information propagation in limited-data regimes.
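The use of uncertainty over expected long-term returns for targeted deep exploration can be sketched as follows (a hedged illustration, not the thesis method): an ensemble of return estimates supplies per-action uncertainty, and acting on an optimistic upper bound steers the agent toward actions whose long-term value is least well known. The `optimistic_values` and `explore_action` helpers and the `beta` bonus weight are assumptions introduced for this sketch.

```python
import numpy as np

# Sketch of uncertainty-driven exploration: disagreement across an
# ensemble of return estimates serves as an exploration bonus.

def optimistic_values(ensemble_q: np.ndarray, beta: float = 1.0) -> np.ndarray:
    """ensemble_q: shape (K, A) -- K ensemble return estimates per action.
    Returns mean + beta * std, an optimistic value per action."""
    return ensemble_q.mean(axis=0) + beta * ensemble_q.std(axis=0)

def explore_action(ensemble_q: np.ndarray, beta: float = 1.0) -> int:
    """Greedy action under the optimistic value estimate."""
    return int(np.argmax(optimistic_values(ensemble_q, beta)))

# Two actions with equal mean return, but action 1 has higher ensemble
# disagreement, so the optimistic rule prefers it for exploration.
q = np.array([[1.0, 0.5],
              [1.0, 1.5]])
chosen = explore_action(q)  # -> 1 (means equal; std 0.0 vs 0.5)
```

Sampling a single ensemble member per episode (Thompson-style) is a common alternative to the optimistic bound and yields temporally consistent, "deep" exploration rather than per-step dithering.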
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Degree: Ph.D., 2024-05
Rights: Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0); copyright retained by author(s). https://creativecommons.org/licenses/by-nc-nd/4.0/