Reinforcement learning with limited reinforcement: Using Bayes risk for active learning in POMDPs
Acting in domains where an agent must plan several steps ahead to achieve a goal can be a challenging task, especially if the agent's sensors provide only noisy or partial information. In this setting, Partially Observable Markov Decision Processes (POMDPs) provide a planning framework that optimall...
Main authors: | , , |
---|---|
Other authors: | |
Material type: | Article |
Language: | en_US |
Published: | Elsevier, 2017 |
Links: | http://hdl.handle.net/1721.1/108303 https://orcid.org/0000-0002-8293-0492 |