Towards feature selection in actor-critic algorithms

URL to paper listed on conference page

Bibliographic Details
Main Authors: Rohanimanesh, Khashayar, Roy, Nicholas, Tedrake, Russell Louis
Other Authors: Massachusetts Institute of Technology. Department of Aeronautics and Astronautics
Format: Article
Language: en_US
Published: 2011
Online Access: http://hdl.handle.net/1721.1/64445
https://orcid.org/0000-0002-8712-7092
https://orcid.org/0000-0002-8293-0492
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Abstract: Choosing features for the critic in actor-critic algorithms with function approximation is known to be a challenge. Too few critic features can lead to degeneracy of the actor gradient, and too many features may lead to slower convergence of the learner. In this paper, we show that a well-studied class of actor policies satisfies the known requirements for convergence when the actor features are selected carefully. We demonstrate that two popular representations for value methods, the barycentric interpolators and the graph Laplacian proto-value functions, can be used to represent the actor in order to satisfy these conditions. A consequence of this work is a generalization of the proto-value function methods to the continuous action actor-critic domain. Finally, we analyze the performance of this approach using a simulation of a torque-limited inverted pendulum.
Citation: Rohanimanesh, Khashayar, Nicholas Roy and Russ Tedrake. "Towards feature selection in actor-critic algorithms." In Proceedings of the ICML/UAI/COLT Workshop on Abstraction in Reinforcement Learning, Montreal, Canada, 2009.
Published in: Proceedings of Workshop on Abstraction in Reinforcement Learning, Joint workshop at ICML, UAI, and COLT 2009
Workshop papers page: http://www-all.cs.umass.edu/~gdk/arl/papers.html
Type: http://purl.org/eprint/type/ConferencePaper
Dates: 2011-06-15T19:54:57Z; 2009-06
License: Creative Commons Attribution-Noncommercial-Share Alike 3.0, http://creativecommons.org/licenses/by-nc-sa/3.0/
File format: application/pdf
Source: MIT web domain