Provably efficient learning with typed parametric models
To quickly achieve good performance, reinforcement-learning algorithms for acting in large continuous-valued domains must use a representation that is both sufficiently powerful to capture important domain characteristics, and yet simultaneously allows generalization, or sharing, among experiences....
Main Authors: | , , , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | en_US |
Published: |
Journal of Machine Learning Research
2010
|
Online Access: | http://hdl.handle.net/1721.1/60042 https://orcid.org/0000-0002-8293-0492 |
_version_ | 1826189182967480320 |
---|---|
author | Brunskill, Emma Leffler, Bethany R. Li, Lihong Littman, Michael L. Roy, Nicholas |
author2 | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
author_facet | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Brunskill, Emma Leffler, Bethany R. Li, Lihong Littman, Michael L. Roy, Nicholas |
author_sort | Brunskill, Emma |
collection | MIT |
description | To quickly achieve good performance, reinforcement-learning algorithms for acting in large continuous-valued domains must use a representation that is both sufficiently powerful to capture important domain characteristics, and yet simultaneously allows generalization, or sharing, among experiences. Our algorithm balances this tradeoff by using a stochastic, switching, parametric dynamics representation. We argue that this model characterizes a number of significant, real-world domains, such as robot navigati on across varying terrain. We prove that this representational assumption allows our algorithm to be probably approximately correct with a sample complexity that scales polynomially with all problem-specific quantities including the state-space dimension. We also explicitly incorporate the error introduced by approximate planning in our sample complexity bounds, in contrast to prior Probably Approximately Correct (PAC) Markov Decision Processes (MDP) approaches, which typically assume the estimated MDP can be solved exactly. Our experimental results on constructing plans for driving to work using real car trajectory data, as well as a small robot experiment on navigating varying terrain, demonstrate that our dynamics representation enables us to capture real-world dynamics in a sufficient manner to produce good performance. |
first_indexed | 2024-09-23T08:10:50Z |
format | Article |
id | mit-1721.1/60042 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T08:10:50Z |
publishDate | 2010 |
publisher | Journal of Machine Learning Research |
record_format | dspace |
spelling | mit-1721.1/600422022-09-30T08:06:18Z Provably efficient learning with typed parametric models Brunskill, Emma Leffler, Bethany R. Li, Lihong Littman, Michael L. Roy, Nicholas Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology. Department of Aeronautics and Astronautics Roy, Nicholas Roy, Nicholas Brunskill, Emma To quickly achieve good performance, reinforcement-learning algorithms for acting in large continuous-valued domains must use a representation that is both sufficiently powerful to capture important domain characteristics, and yet simultaneously allows generalization, or sharing, among experiences. Our algorithm balances this tradeoff by using a stochastic, switching, parametric dynamics representation. We argue that this model characterizes a number of significant, real-world domains, such as robot navigati on across varying terrain. We prove that this representational assumption allows our algorithm to be probably approximately correct with a sample complexity that scales polynomially with all problem-specific quantities including the state-space dimension. We also explicitly incorporate the error introduced by approximate planning in our sample complexity bounds, in contrast to prior Probably Approximately Correct (PAC) Markov Decision Processes (MDP) approaches, which typically assume the estimated MDP can be solved exactly. Our experimental results on constructing plans for driving to work using real car trajectory data, as well as a small robot experiment on navigating varying terrain, demonstrate that our dynamics representation enables us to capture real-world dynamics in a sufficient manner to produce good performance. 2010-11-29T17:59:03Z 2010-11-29T17:59:03Z 2009-08 2009-03 Article http://purl.org/eprint/type/JournalArticle 1532-4435 1533-7928 http://hdl.handle.net/1721.1/60042 Brunskill, Emma et al. "Provably Efficient Learning with Typed Parametric Models." Journal of Machine Learning Research, 10 (December 2009), 1955-1988. https://orcid.org/0000-0002-8293-0492 en_US http://dx.doi.org/10.1145/1577069.1755851 Journal of Machine Learning Research Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. application/pdf Journal of Machine Learning Research N. Roy via Barbara Williams |
spellingShingle | Brunskill, Emma Leffler, Bethany R. Li, Lihong Littman, Michael L. Roy, Nicholas Provably efficient learning with typed parametric models |
title | Provably efficient learning with typed parametric models |
title_full | Provably efficient learning with typed parametric models |
title_fullStr | Provably efficient learning with typed parametric models |
title_full_unstemmed | Provably efficient learning with typed parametric models |
title_short | Provably efficient learning with typed parametric models |
title_sort | provably efficient learning with typed parametric models |
url | http://hdl.handle.net/1721.1/60042 https://orcid.org/0000-0002-8293-0492 |
work_keys_str_mv | AT brunskillemma provablyefficientlearningwithtypedparametricmodels AT lefflerbethanyr provablyefficientlearningwithtypedparametricmodels AT lilihong provablyefficientlearningwithtypedparametricmodels AT littmanmichaell provablyefficientlearningwithtypedparametricmodels AT roynicholas provablyefficientlearningwithtypedparametricmodels |