Representation Discovery for Kernel-Based Reinforcement Learning
Recent years have seen increased interest in non-parametric reinforcement learning. There are now practical kernel-based algorithms for approximating value functions; however, kernel regression requires that the underlying function being approximated be smooth on its domain. Few problems of interest satisfy this requirement in their natural representation. In this paper we define the Value-Consistent Pseudometric (VCPM), the distance function corresponding to a transformation of the domain into a space where the target function is maximally smooth and thus well-approximated by kernel regression. We then present DKBRL, an iterative batch RL algorithm interleaving steps of Kernel-Based Reinforcement Learning and distance metric adjustment. We evaluate its performance on Acrobot and PinBall, continuous-space reinforcement learning domains with discontinuous value functions.
Main Authors: | Zewdie, Dawit H. | Konidaris, George |
---|---|
Other Authors: | Leslie Kaelbling |
Published: | 2015 |
Subjects: | Metric learning |
Online Access: | http://hdl.handle.net/1721.1/100053 |
author | Zewdie, Dawit H.; Konidaris, George |
author2 | Leslie Kaelbling |
collection | MIT |
description | Recent years have seen increased interest in non-parametric reinforcement learning. There are now practical kernel-based algorithms for approximating value functions; however, kernel regression requires that the underlying function being approximated be smooth on its domain. Few problems of interest satisfy this requirement in their natural representation. In this paper we define Value-Consistent Pseudometric (VCPM), the distance function corresponding to a transformation of the domain into a space where the target function is maximally smooth and thus well-approximated by kernel regression. We then present DKBRL, an iterative batch RL algorithm interleaving steps of Kernel-Based Reinforcement Learning and distance metric adjustment. We evaluate its performance on Acrobot and PinBall, continuous-space reinforcement learning domains with discontinuous value functions. |
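The abstract's core observation — that kernel regression only approximates well where the target is smooth, and that a metric encoding the target's discontinuities restores accuracy — can be illustrated with a toy Nadaraya-Watson estimator. This is not the paper's DKBRL or VCPM construction; it is a generic sketch in which the "value-consistent" metric is simulated by adding the target's own jump to the distance:

```python
import numpy as np

def gaussian_kernel(d, bandwidth):
    # Smoothing weight that decays with distance under the chosen metric.
    return np.exp(-(d / bandwidth) ** 2)

def kernel_regress(x_query, x_train, y_train, dist, bandwidth=0.1):
    # Nadaraya-Watson estimator: a kernel-weighted average of the samples.
    w = gaussian_kernel(dist(x_query, x_train), bandwidth)
    return np.sum(w * y_train) / np.sum(w)

# A discontinuous target, like a value function with a cliff at x = 0.5.
x = np.linspace(0.0, 1.0, 200)
y = np.where(x < 0.5, 0.0, 1.0)

# Under the natural (Euclidean) metric, an estimate just left of the
# cliff averages across both sides and is badly biased.
euclid = lambda q, xs: np.abs(q - xs)
est_euclid = kernel_regress(0.49, x, y, euclid)

# A metric that stretches distances across the discontinuity (here, a
# hypothetical toy built from the target values themselves, in the
# spirit of a value-consistent transform) keeps the estimate on-side.
side = lambda v: np.where(v < 0.5, 0.0, 1.0)
stretched = lambda q, xs: np.abs(q - xs) + np.abs(side(q) - side(xs))
est_stretched = kernel_regress(0.49, x, y, stretched)
```

With the Euclidean metric the estimate at 0.49 lands well between 0 and 1 (roughly 0.4), while under the stretched metric it is essentially 0, the correct left-of-cliff value — the same effect the paper pursues by learning the metric from value estimates rather than from the (unknown) target.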
id | mit-1721.1/100053 |
institution | Massachusetts Institute of Technology |
publishDate | 2015 |
last modified | 2019-04-10T21:28:02Z |
department | Learning and Intelligent Systems |
report number | MIT-CSAIL-TR-2015-032 |
date issued | 2015-11-24 |
date available | 2015-11-30T19:30:04Z |
rights | Creative Commons Attribution-ShareAlike 4.0 International (http://creativecommons.org/licenses/by-sa/4.0/) |
extent | 16 p. |
format | application/pdf |
title | Representation Discovery for Kernel-Based Reinforcement Learning |
topic | Metric learning |
url | http://hdl.handle.net/1721.1/100053 |