Invariant causal prediction for block MDPs

Generalization across environments is critical to the successful application of reinforcement learning (RL) algorithms to real-world challenges. In this work we propose a method for learning state abstractions which generalize to novel observation distributions in the multi-environment RL setting. W...

Bibliographic Details
Main Authors: Zhang, A; Lyle, C; Sodhani, S; Filos, A; Kwiatkowska, M; Pineau, J; Gal, Y; Precup, D
Format: Conference item
Language: English
Published: Proceedings of Machine Learning Research, 2020
collection OXFORD
description Generalization across environments is critical to the successful application of reinforcement learning (RL) algorithms to real-world challenges. In this work we propose a method for learning state abstractions which generalize to novel observation distributions in the multi-environment RL setting. We prove that for certain classes of environments, this approach outputs, with high probability, a state abstraction corresponding to the causal feature set with respect to the return. We give empirical evidence that analogous methods for the nonlinear setting can also attain improved generalization over single- and multi-task baselines. Lastly, we provide bounds on model generalization error in the multi-environment setting, in the process showing a connection between causal variable identification and the state abstraction framework for MDPs.
first_indexed 2024-03-07T04:30:18Z
format Conference item
id oxford-uuid:ce1007dd-df7a-43c6-aabd-a9fa70cff47f
institution University of Oxford
language English
last_indexed 2024-03-07T04:30:18Z
publishDate 2020
publisher Proceedings of Machine Learning Research
record_format dspace
spelling oxford-uuid:ce1007dd-df7a-43c6-aabd-a9fa70cff47f; 2022-03-27T07:33:13Z; Invariant causal prediction for block MDPs; Conference item; http://purl.org/coar/resource_type/c_5794; uuid:ce1007dd-df7a-43c6-aabd-a9fa70cff47f; English; Symplectic Elements; Proceedings of Machine Learning Research; 2020; Zhang, A; Lyle, C; Sodhani, S; Filos, A; Kwiatkowska, M; Pineau, J; Gal, Y; Precup, D
title Invariant causal prediction for block MDPs