Quickest change detection approach to optimal control in Markov decision processes with model changes

Optimal control in non-stationary Markov decision processes (MDPs) is a challenging problem. The aim in such a control problem is to maximize the long-term discounted reward when the transition dynamics or the reward function can change over time. When prior knowledge of change statistics is availa...

Bibliographic Details
Main Authors: Banerjee, Taposh; Liu, Miao; How, Jonathan P
Other Authors: Massachusetts Institute of Technology. Laboratory for Information and Decision Systems
Format: Article
Published: Institute of Electrical and Electronics Engineers (IEEE) 2018
Online Access: http://hdl.handle.net/1721.1/114735
https://orcid.org/0000-0002-1648-8325
https://orcid.org/0000-0001-8576-1930