Invariant causal prediction for block MDPs
Generalization across environments is critical to the successful application of reinforcement learning (RL) algorithms to real-world challenges. In this work we propose a method for learning state abstractions which generalize to novel observation distributions in the multi-environment RL setting. W...
Main Authors: | Zhang, A; Lyle, C; Sodhani, S; Filos, A; Kwiatkowska, M; Pineau, J; Gal, Y; Precup, D |
---|---|
Format: | Conference item |
Language: | English |
Published: |
Proceedings of Machine Learning Research
2020
|
_version_ | 1826297353439543296 |
---|---|
author | Zhang, A; Lyle, C; Sodhani, S; Filos, A; Kwiatkowska, M; Pineau, J; Gal, Y; Precup, D |
author_facet | Zhang, A; Lyle, C; Sodhani, S; Filos, A; Kwiatkowska, M; Pineau, J; Gal, Y; Precup, D |
author_sort | Zhang, A |
collection | OXFORD |
description | Generalization across environments is critical to the successful application of reinforcement learning (RL) algorithms to real-world challenges. In this work we propose a method for learning state abstractions which generalize to novel observation distributions in the multi-environment RL setting. We prove that for certain classes of environments, this approach outputs, with high probability, a state abstraction corresponding to the causal feature set with respect to the return. We give empirical evidence that analogous methods for the nonlinear setting can also attain improved generalization over single- and multi-task baselines. Lastly, we provide bounds on model generalization error in the multi-environment setting, in the process showing a connection between causal variable identification and the state abstraction framework for MDPs. |
first_indexed | 2024-03-07T04:30:18Z |
format | Conference item |
id | oxford-uuid:ce1007dd-df7a-43c6-aabd-a9fa70cff47f |
institution | University of Oxford |
language | English |
last_indexed | 2024-03-07T04:30:18Z |
publishDate | 2020 |
publisher | Proceedings of Machine Learning Research |
record_format | dspace |
spelling | oxford-uuid:ce1007dd-df7a-43c6-aabd-a9fa70cff47f; 2022-03-27T07:33:13Z; Invariant causal prediction for block MDPs; Conference item; http://purl.org/coar/resource_type/c_5794; uuid:ce1007dd-df7a-43c6-aabd-a9fa70cff47f; English; Symplectic Elements; Proceedings of Machine Learning Research; 2020; Zhang, A; Lyle, C; Sodhani, S; Filos, A; Kwiatkowska, M; Pineau, J; Gal, Y; Precup, D; Generalization across environments is critical to the successful application of reinforcement learning (RL) algorithms to real-world challenges. In this work we propose a method for learning state abstractions which generalize to novel observation distributions in the multi-environment RL setting. We prove that for certain classes of environments, this approach outputs, with high probability, a state abstraction corresponding to the causal feature set with respect to the return. We give empirical evidence that analogous methods for the nonlinear setting can also attain improved generalization over single- and multi-task baselines. Lastly, we provide bounds on model generalization error in the multi-environment setting, in the process showing a connection between causal variable identification and the state abstraction framework for MDPs. |
spellingShingle | Zhang, A; Lyle, C; Sodhani, S; Filos, A; Kwiatkowska, M; Pineau, J; Gal, Y; Precup, D; Invariant causal prediction for block MDPs |
title | Invariant causal prediction for block MDPs |
title_full | Invariant causal prediction for block MDPs |
title_fullStr | Invariant causal prediction for block MDPs |
title_full_unstemmed | Invariant causal prediction for block MDPs |
title_short | Invariant causal prediction for block MDPs |
title_sort | invariant causal prediction for block mdps |
work_keys_str_mv | AT zhanga invariantcausalpredictionforblockmdps AT lylec invariantcausalpredictionforblockmdps AT sodhanis invariantcausalpredictionforblockmdps AT filosa invariantcausalpredictionforblockmdps AT kwiatkowskam invariantcausalpredictionforblockmdps AT pineauj invariantcausalpredictionforblockmdps AT galy invariantcausalpredictionforblockmdps AT precupd invariantcausalpredictionforblockmdps |