Maximizing Information Gain in Partially Observable Environments via Prediction Rewards
Information gathering in a partially observable environment can be formulated as a reinforcement learning (RL) problem where the reward depends on the agent's uncertainty. For example, the reward can be the negative entropy of the agent's belief over an unknown (or hidden) variable. Typically, the...
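As a minimal sketch of the reward described in the abstract: the negative entropy of a belief (a probability distribution over the hidden variable's values) is highest, i.e. closest to zero, when the belief is most certain. The function name and example beliefs below are illustrative, not from the paper.

```python
import math

def negative_entropy_reward(belief):
    """Reward = negative Shannon entropy of a belief distribution.

    belief: list of probabilities over the hidden variable's values.
    Higher (closer to 0) when the belief is more certain.
    """
    return sum(p * math.log(p) for p in belief if p > 0)

# A uniform belief (maximal uncertainty) gives the lowest reward;
uniform = [0.25, 0.25, 0.25, 0.25]
# a peaked belief (near certainty) gives a reward near zero.
peaked = [0.97, 0.01, 0.01, 0.01]
```

Here `negative_entropy_reward(uniform)` equals `-log(4)`, while the peaked belief scores strictly higher, rewarding the agent for reducing uncertainty.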
| Main Authors: | Lim, S; Satsangi, Y; Whiteson, S; Oliehoek, FA; White, M |
|---|---|
| Format: | Conference item |
| Language: | English |
| Published: | 2020 |
Similar Items
- PAC greedy maximization with efficient bounds on information gain for sensor selection
  by: Whiteson, S, et al.
  Published: (2016)
- Exploiting submodular value functions for scaling up active perception
  by: Satsangi, Y, et al.
  Published: (2017)
- Real-time resource allocation for tracking systems
  by: Satsangi, Y, et al.
  Published: (2017)
- Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning
  by: Castellini, J, et al.
  Published: (2021)
- Sorting Objects from a Conveyor Belt Using POMDPs with Multiple-Object Observations and Information-Gain Rewards
  by: Ady-Daniel Mezei, et al.
  Published: (2020-04-01)