Factored State Abstraction for Option Learning


Bibliographic Details
Main Author: Abdulhai, Marwa
Other Authors: How, Jonathan P.
Format: Thesis
Published: Massachusetts Institute of Technology, 2022
Online Access: https://hdl.handle.net/1721.1/140090
Description
Summary: Hierarchical reinforcement learning has focused on discovering temporally extended actions (options) to provide efficient solutions for long-horizon decision-making problems with sparse rewards. One promising approach that learns these options end-to-end in this setting is the option-critic (OC) framework. However, this method has several practical limitations, including a lack of diversity between the learned sub-policies and sample inefficiency. This thesis shows that the OC framework does not decompose problems into smaller, largely independent components, but instead increases the problem complexity with each option by considering the entire state space during learning. To address this issue, we introduce state abstracted option-critic (SOC), a new framework that combines temporal and state abstraction to effectively reduce the problem complexity in sparse reward settings. Our contribution includes learning a factored state space that enables each option to map to a sub-section of the state space. We test our method against hierarchical, non-hierarchical, and state abstraction baselines and demonstrate better sample efficiency and higher overall performance in both image and large vector-state representations under sparse reward settings.
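
To make the core idea concrete, the following is a minimal conceptual sketch (not the thesis implementation) of pairing each option with its own learned state abstraction: a soft mask over factored state features that feeds an intra-option policy and a termination head. All class and variable names here (AbstractedOption, mask_logits, etc.) are hypothetical illustrations, assuming a PyTorch-style setup.

# Conceptual sketch only: each option attends to a sub-section of a factored
# state via a learned soft mask. Names and structure are assumptions, not SOC.
import torch
import torch.nn as nn

class AbstractedOption(nn.Module):
    """One option: a per-option soft mask over state features (state
    abstraction) feeding an intra-option policy and a termination head."""

    def __init__(self, state_dim: int, num_actions: int, hidden: int = 64):
        super().__init__()
        # Learnable logits for a soft mask; the sigmoid keeps each feature's
        # weight in [0, 1], so the option focuses on part of the state space.
        self.mask_logits = nn.Parameter(torch.zeros(state_dim))
        self.policy = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, num_actions),
        )
        self.termination = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, state: torch.Tensor):
        mask = torch.sigmoid(self.mask_logits)   # per-feature attention weights
        abstracted = state * mask                 # factored sub-state for this option
        action_logits = self.policy(abstracted)   # intra-option policy
        beta = self.termination(abstracted)       # option termination probability
        return action_logits, beta, mask

# Usage: a small set of options, each learning its own abstraction.
options = nn.ModuleList(AbstractedOption(state_dim=16, num_actions=4)
                        for _ in range(4))
state = torch.randn(1, 16)
logits, beta, mask = options[0](state)

The design intent illustrated here is that each option's policy only sees the masked (abstracted) state, so adding options need not grow the effective problem size the way it does when every option conditions on the full state.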