Quickest change detection approach to optimal control in Markov decision processes with model changes
Optimal control in non-stationary Markov decision processes (MDPs) is a challenging problem. The aim in such a control problem is to maximize the long-term discounted reward when the transition dynamics or the reward function can change over time. When prior knowledge of change statistics is availa...
Main Authors: | , , |
---|---|
Other Authors: | |
Format: | Article |
Published: | Institute of Electrical and Electronics Engineers (IEEE), 2018 |
Online Access: | http://hdl.handle.net/1721.1/114735 https://orcid.org/0000-0002-1648-8325 https://orcid.org/0000-0001-8576-1930 |