Markov decision processes with observation costs: framework and computation with a penalty scheme
We consider Markov decision processes where the state of the chain is only given at chosen observation times and at a cost. Optimal strategies involve the optimisation of observation times as well as the subsequent action values. We consider the finite horizon and discounted infinite horizon problem...
Main authors: | , |
---|---|
Format: | Journal article |
Language: | English |
Published: | INFORMS 2024 |