Improving the efficiency of Bayesian inverse reinforcement learning

Inverse reinforcement learning (IRL) is the task of learning the reward function of a Markov Decision Process (MDP) given knowledge of the transition function and a set of expert demonstrations. While many IRL algorithms exist, Bayesian IRL [1] provides a general and principled method of reward lear...


Bibliographic Details
Main Authors:	How, Jonathan P., Michini, Bernard J.
Other Authors:	Massachusetts Institute of Technology. Aerospace Controls Laboratory
Format:	Article
Language:	en_US
Published:	Institute of Electrical and Electronics Engineers (IEEE) 2013
Online Access:	http://hdl.handle.net/1721.1/81489 https://orcid.org/0000-0001-8576-1930

author	How, Jonathan P.; Michini, Bernard J.
author2	Massachusetts Institute of Technology. Aerospace Controls Laboratory
collection	MIT
description	Inverse reinforcement learning (IRL) is the task of learning the reward function of a Markov Decision Process (MDP) given knowledge of the transition function and a set of expert demonstrations. While many IRL algorithms exist, Bayesian IRL [1] provides a general and principled method of reward learning by casting the problem in the Bayesian inference framework. However, the algorithm as originally presented suffers from several inefficiencies that prohibit its use for even moderate problem sizes. This paper proposes modifications to the original Bayesian IRL algorithm to improve its efficiency and tractability in situations where the state space is large and the expert demonstrations span only a small portion of it. The key insight is that the inference task should be focused on states that are similar to those encountered by the expert, as opposed to making the naive assumption that the expert demonstrations contain enough information to accurately infer the reward function over the entire state space. A modified algorithm is presented and experimental results show substantially faster convergence while maintaining the solution quality of the original method.
first_indexed	2024-09-23T15:59:44Z
format	Article
id	mit-1721.1/81489
institution	Massachusetts Institute of Technology
language	en_US
last_indexed	2024-09-23T15:59:44Z
publishDate	2013
publisher	Institute of Electrical and Electronics Engineers (IEEE)
record_format	dspace
spelling	mit-1721.1/81489 2022-09-29T17:33:07Z Improving the efficiency of Bayesian inverse reinforcement learning How, Jonathan P. Michini, Bernard J. Massachusetts Institute of Technology. Aerospace Controls Laboratory Massachusetts Institute of Technology. Department of Aeronautics and Astronautics Massachusetts Institute of Technology. Laboratory for Information and Decision Systems Michini, Bernard J. How, Jonathan P. United States. Office of Naval Research (Science of Autonomy Program Contract N000140910625) 2013-10-23T16:56:46Z 2013-10-23T16:56:46Z 2012-05 Article http://purl.org/eprint/type/ConferencePaper 978-1-4673-1405-3 978-1-4673-1403-9 978-1-4673-1578-4 978-1-4673-1404-6 http://hdl.handle.net/1721.1/81489 Michini, Bernard, and Jonathan P. How. “Improving the efficiency of Bayesian inverse reinforcement learning.” In 2012 IEEE International Conference on Robotics and Automation, 3651-3656. Institute of Electrical and Electronics Engineers, 2012. https://orcid.org/0000-0001-8576-1930 en_US http://dx.doi.org/10.1109/ICRA.2012.6225241 Proceedings of the 2012 IEEE International Conference on Robotics and Automation Creative Commons Attribution-Noncommercial-Share Alike 3.0 http://creativecommons.org/licenses/by-nc-sa/3.0/ application/pdf Institute of Electrical and Electronics Engineers (IEEE) MIT web domain
title	Improving the efficiency of Bayesian inverse reinforcement learning

Similar Items