Reinforcement learning with limited reinforcement: Using Bayes risk for active learning in POMDPs
Acting in domains where an agent must plan several steps ahead to achieve a goal can be a challenging task, especially if the agentʼs sensors provide only noisy or partial information. In this setting, Partially Observable Markov Decision Processes (POMDPs) provide a planning framework that optimall...
Main Authors: | , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | en_US |
Published: |
Elsevier
2017
|
Online Access: | http://hdl.handle.net/1721.1/108303 https://orcid.org/0000-0002-8293-0492 |