Efficient Learning Control via Structural Policy Priors, Latent World Models and Hierarchical Abstraction
Learning control will enable the deployment of autonomous robots in unstructured real-world settings. Solving the associated complex decision processes under real-time constraints will require intuition, guiding current actions by prior experience to anticipate long-horizon environment interactions...
Main Author: | Seyde, Tim N. |
---|---|
Other Authors: | Rus, Daniela |
Format: | Thesis |
Published: | Massachusetts Institute of Technology, 2024 |
Online Access: | https://hdl.handle.net/1721.1/156611 |
author | Seyde, Tim N. |
author2 | Rus, Daniela |
collection | MIT |
description | Learning control will enable the deployment of autonomous robots in unstructured real-world settings. Solving the associated complex decision processes under real-time constraints will require intuition, guiding current actions by prior experience to anticipate long-horizon environment interactions and integrating with optimal control to ground action selection in short-horizon system constraints. Ensuring tractability of the underlying learning process is conditional upon maximizing task-aligned information extracted from environment interactions while minimizing the required guidance via human interventions. In this thesis, we develop novel learning control algorithms that enable efficient acquisition of complex behaviors while limiting prior knowledge, direct human supervision, and computational requirements. Our study focuses on learning from interaction through reinforcement learning, combining insights from model-free, model-based, and hierarchical techniques. We design decoupled discrete policy structures to yield memory-efficient agent representations. Our study demonstrates the competitive performance of critic-only agents on continuous control tasks, highlighting accelerated information propagation and exploration benefits. We further leverage hierarchical abstraction over diverse behavior components to enable time-efficient optimization. Our methods jointly learn heterogeneous low-level controller parameterizations via mixture policies for single-agent control while decoupling multi-timescale strategic reasoning from reactive reasoning in the context of multi-agent team coordination. We lastly build latent world models for multi-step reasoning and sample-efficient interaction selection. Our work employs uncertainty over expected long-term returns for targeted deep exploration and constructs multi-agent interaction models to accelerate competitive behavior learning via self-play in imagination.
In sum, this thesis develops scalable and efficient robot learning algorithms by addressing representational challenges across layers of abstraction, providing agents with an intrinsic ability to set implicit exploration goals under high-level guidance, and facilitating information propagation in limited data regimes. |
format | Thesis |
id | mit-1721.1/156611 |
institution | Massachusetts Institute of Technology |
publishDate | 2024 |
publisher | Massachusetts Institute of Technology |
record_format | dspace |
spelling | mit-1721.1/156611 Efficient Learning Control via Structural Policy Priors, Latent World Models and Hierarchical Abstraction Seyde, Tim N. Rus, Daniela Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Ph.D. 2024-05 Thesis https://hdl.handle.net/1721.1/156611 Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) Copyright retained by author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/ application/pdf Massachusetts Institute of Technology |
title | Efficient Learning Control via Structural Policy Priors, Latent World Models and Hierarchical Abstraction |
url | https://hdl.handle.net/1721.1/156611 |
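The abstract above mentions decoupled discrete policy structures that make critic-only agents memory-efficient on continuous control tasks. As a rough illustration of the general idea (an illustrative sketch under simplifying assumptions, not the thesis implementation; all class and parameter names here are hypothetical), one can give each action dimension its own small discrete Q-head over a few action bins, so the representation grows linearly rather than exponentially with the number of action dimensions:

```python
import numpy as np

# Illustrative sketch (hypothetical names, not the thesis code): a
# "decoupled" discrete critic assigns each action dimension its own small
# linear Q-head over a few bins (e.g. bang-bang: {-1, +1}). Memory is
# O(n_dims * n_bins) instead of O(n_bins ** n_dims) for a joint table.

class DecoupledDiscreteCritic:
    def __init__(self, n_dims, n_bins, n_features):
        # One linear Q-head per action dimension: (n_dims, n_bins, n_features).
        self.w = np.zeros((n_dims, n_bins, n_features))

    def q_per_dim(self, features):
        # Q-values for every (dimension, bin) pair given state features.
        return self.w @ features                        # shape (n_dims, n_bins)

    def greedy_action(self, features):
        # Each dimension selects its best bin independently of the others.
        return self.q_per_dim(features).argmax(axis=1)  # shape (n_dims,)

    def joint_value(self, features, action):
        # Joint state-action value as the mean of the chosen per-dim Q-values.
        q = self.q_per_dim(features)
        return q[np.arange(len(action)), action].mean()

critic = DecoupledDiscreteCritic(n_dims=6, n_bins=2, n_features=8)
a = critic.greedy_action(np.ones(8))
print(a.shape)  # → (6,)
```

With six action dimensions and two bins each, the decoupled critic stores 6 × 2 heads rather than 2⁶ joint entries; the per-dimension argmax is what enables the fast, exploration-friendly action selection the abstract alludes to.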