Towards feature selection in actor-critic algorithms
URL to paper listed on conference page
Main Authors: | Rohanimanesh, Khashayar; Roy, Nicholas; Tedrake, Russell Louis |
---|---|
Other Authors: | Massachusetts Institute of Technology. Department of Aeronautics and Astronautics |
Format: | Article |
Language: | en_US |
Published: | 2011 |
Online Access: | http://hdl.handle.net/1721.1/64445 https://orcid.org/0000-0002-8712-7092 https://orcid.org/0000-0002-8293-0492 |
_version_ | 1826205773601964032 |
---|---|
author | Rohanimanesh, Khashayar Roy, Nicholas Tedrake, Russell Louis |
author2 | Massachusetts Institute of Technology. Department of Aeronautics and Astronautics |
author_facet | Massachusetts Institute of Technology. Department of Aeronautics and Astronautics Rohanimanesh, Khashayar Roy, Nicholas Tedrake, Russell Louis |
author_sort | Rohanimanesh, Khashayar |
collection | MIT |
description | URL to paper listed on conference page |
first_indexed | 2024-09-23T13:18:52Z |
format | Article |
id | mit-1721.1/64445 |
institution | Massachusetts Institute of Technology |
language | en_US |
last_indexed | 2024-09-23T13:18:52Z |
publishDate | 2011 |
record_format | dspace |
spelling | mit-1721.1/64445 2022-09-28T13:20:13Z Towards feature selection in actor-critic algorithms Rohanimanesh, Khashayar Roy, Nicholas Tedrake, Russell Louis Massachusetts Institute of Technology. Department of Aeronautics and Astronautics Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science Tedrake, Russell Louis Tedrake, Russell Louis Roy, Nicholas URL to paper listed on conference page Choosing features for the critic in actor-critic algorithms with function approximation is known to be a challenge. Too few critic features can lead to degeneracy of the actor gradient, and too many features may lead to slower convergence of the learner. In this paper, we show that a well-studied class of actor policies satisfies the known requirements for convergence when the actor features are selected carefully. We demonstrate that two popular representations for value methods - the barycentric interpolators and the graph Laplacian proto-value functions - can be used to represent the actor in order to satisfy these conditions. A consequence of this work is a generalization of the proto-value function methods to the continuous-action actor-critic domain. Finally, we analyze the performance of this approach using a simulation of a torque-limited inverted pendulum. 2011-06-15T19:54:57Z 2011-06-15T19:54:57Z 2009-06 Article http://purl.org/eprint/type/ConferencePaper http://hdl.handle.net/1721.1/64445 Rohanimanesh, Khashayar, Nicholas Roy and Russ Tedrake. "Towards feature selection in actor-critic algorithms." In Proceedings of the ICML/UAI/COLT Workshop on Abstraction in Reinforcement Learning, Montreal, Canada, 2009. https://orcid.org/0000-0002-8712-7092 https://orcid.org/0000-0002-8293-0492 en_US http://www-all.cs.umass.edu/~gdk/arl/papers.html Proceedings of Workshop on Abstraction in Reinforcement Learning, Joint workshop at ICML, UAI, and COLT 2009 Creative Commons Attribution-Noncommercial-Share Alike 3.0 http://creativecommons.org/licenses/by-nc-sa/3.0/ application/pdf MIT web domain |
spellingShingle | Rohanimanesh, Khashayar Roy, Nicholas Tedrake, Russell Louis Towards feature selection in actor-critic algorithms |
title | Towards feature selection in actor-critic algorithms |
title_sort | towards feature selection in actor critic algorithms |
url | http://hdl.handle.net/1721.1/64445 https://orcid.org/0000-0002-8712-7092 https://orcid.org/0000-0002-8293-0492 |
work_keys_str_mv | AT rohanimaneshkhashayar towardsfeatureselectioninactorcriticalgorithms AT roynicholas towardsfeatureselectioninactorcriticalgorithms AT tedrakerusselllouis towardsfeatureselectioninactorcriticalgorithms |