Markov decision processes with observation costs: framework and computation with a penalty scheme
We consider Markov decision processes where the state of the chain is only given at chosen observation times and at a cost. Optimal strategies involve the optimisation of observation times as well as the subsequent action values. We consider the finite horizon and discounted infinite horizon problem...
Main Authors: | , |
---|---|
Format: | Journal article |
Language: | English |
Published: | INFORMS 2024 |