Provably efficient learning with typed parametric models

To quickly achieve good performance, reinforcement-learning algorithms for acting in large continuous-valued domains must use a representation that is both sufficiently powerful to capture important domain characteristics, and yet simultaneously allows generalization, or sharing, among experiences....

Full description

Bibliographic Details
Main Authors:	Brunskill, Emma, Leffler, Bethany R., Li, Lihong, Littman, Michael L., Roy, Nicholas
Other Authors:	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format:	Article
Language:	en_US
Published:	Journal of Machine Learning Research 2010
Online Access:	http://hdl.handle.net/1721.1/60042 https://orcid.org/0000-0002-8293-0492

_version_	1826189182967480320
author	Brunskill, Emma Leffler, Bethany R. Li, Lihong Littman, Michael L. Roy, Nicholas
author2	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
author_facet	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Brunskill, Emma Leffler, Bethany R. Li, Lihong Littman, Michael L. Roy, Nicholas
author_sort	Brunskill, Emma
collection	MIT
description	To quickly achieve good performance, reinforcement-learning algorithms for acting in large continuous-valued domains must use a representation that is both sufficiently powerful to capture important domain characteristics, and yet simultaneously allows generalization, or sharing, among experiences. Our algorithm balances this tradeoff by using a stochastic, switching, parametric dynamics representation. We argue that this model characterizes a number of significant, real-world domains, such as robot navigati on across varying terrain. We prove that this representational assumption allows our algorithm to be probably approximately correct with a sample complexity that scales polynomially with all problem-specific quantities including the state-space dimension. We also explicitly incorporate the error introduced by approximate planning in our sample complexity bounds, in contrast to prior Probably Approximately Correct (PAC) Markov Decision Processes (MDP) approaches, which typically assume the estimated MDP can be solved exactly. Our experimental results on constructing plans for driving to work using real car trajectory data, as well as a small robot experiment on navigating varying terrain, demonstrate that our dynamics representation enables us to capture real-world dynamics in a sufficient manner to produce good performance.
first_indexed	2024-09-23T08:10:50Z
format	Article
id	mit-1721.1/60042
institution	Massachusetts Institute of Technology
language	en_US
last_indexed	2024-09-23T08:10:50Z
publishDate	2010
publisher	Journal of Machine Learning Research
record_format	dspace
spelling	mit-1721.1/600422022-09-30T08:06:18Z Provably efficient learning with typed parametric models Brunskill, Emma Leffler, Bethany R. Li, Lihong Littman, Michael L. Roy, Nicholas Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory Massachusetts Institute of Technology. Department of Aeronautics and Astronautics Roy, Nicholas Roy, Nicholas Brunskill, Emma To quickly achieve good performance, reinforcement-learning algorithms for acting in large continuous-valued domains must use a representation that is both sufficiently powerful to capture important domain characteristics, and yet simultaneously allows generalization, or sharing, among experiences. Our algorithm balances this tradeoff by using a stochastic, switching, parametric dynamics representation. We argue that this model characterizes a number of significant, real-world domains, such as robot navigati on across varying terrain. We prove that this representational assumption allows our algorithm to be probably approximately correct with a sample complexity that scales polynomially with all problem-specific quantities including the state-space dimension. We also explicitly incorporate the error introduced by approximate planning in our sample complexity bounds, in contrast to prior Probably Approximately Correct (PAC) Markov Decision Processes (MDP) approaches, which typically assume the estimated MDP can be solved exactly. Our experimental results on constructing plans for driving to work using real car trajectory data, as well as a small robot experiment on navigating varying terrain, demonstrate that our dynamics representation enables us to capture real-world dynamics in a sufficient manner to produce good performance. 2010-11-29T17:59:03Z 2010-11-29T17:59:03Z 2009-08 2009-03 Article http://purl.org/eprint/type/JournalArticle 1532-4435 1533-7928 http://hdl.handle.net/1721.1/60042 Brunskill, Emma et al. "Provably Efficient Learning with Typed Parametric Models." Journal of Machine Learning Research, 10 (December 2009), 1955-1988. https://orcid.org/0000-0002-8293-0492 en_US http://dx.doi.org/10.1145/1577069.1755851 Journal of Machine Learning Research Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use. application/pdf Journal of Machine Learning Research N. Roy via Barbara Williams
spellingShingle	Brunskill, Emma Leffler, Bethany R. Li, Lihong Littman, Michael L. Roy, Nicholas Provably efficient learning with typed parametric models
title	Provably efficient learning with typed parametric models
title_full	Provably efficient learning with typed parametric models
title_fullStr	Provably efficient learning with typed parametric models
title_full_unstemmed	Provably efficient learning with typed parametric models
title_short	Provably efficient learning with typed parametric models
title_sort	provably efficient learning with typed parametric models
url	http://hdl.handle.net/1721.1/60042 https://orcid.org/0000-0002-8293-0492
work_keys_str_mv	AT brunskillemma provablyefficientlearningwithtypedparametricmodels AT lefflerbethanyr provablyefficientlearningwithtypedparametricmodels AT lilihong provablyefficientlearningwithtypedparametricmodels AT littmanmichaell provablyefficientlearningwithtypedparametricmodels AT roynicholas provablyefficientlearningwithtypedparametricmodels

Provably efficient learning with typed parametric models

Similar Items