Maximizing Information Gain in Partially Observable Environments via Prediction Rewards

Information gathering in a partially observable environment can be formulated as a reinforcement learning (RL) problem where the reward depends on the agent’s uncertainty. For example, the reward can be the negative entropy of the agent’s belief over an unknown (or hidden) variable. Typically, the...

Bibliographic Details
Main Authors:	Lim, S, Satsangi, Y, Whiteson, S, Oliehoek, FA, White, M
Format:	Conference item
Language:	English
Published:	2020

Similar Items

PAC greedy maximization with efficient bounds on information gain for sensor selection
by: Whiteson, S, et al.
Published: (2016)

Exploiting submodular value functions for scaling up active perception
by: Satsangi, Y, et al.
Published: (2017)

Real-time resource allocation for tracking systems
by: Satsangi, Y, et al.
Published: (2017)

Analysing factorizations of action-value networks for cooperative multi-agent reinforcement learning
by: Castellini, J, et al.
Published: (2021)

Sorting Objects from a Conveyor Belt Using POMDPs with Multiple-Object Observations and Information-Gain Rewards
by: Ady-Daniel Mezei, et al.
Published: (2020-04-01)

Discrete Approximate Information States in Partially Observable Environments
by: Yang, Lujie
Published: (2023)

Adaptation and learning as strategies to maximize reward in neurofeedback tasks
by: Rodrigo Osuna-Orozco, et al.
Published: (2024-03-01)

Balancing Teacher Following and Reward Maximization in Reinforcement Learning
by: Shenfeld Amit, Idan
Published: (2024)

Nearly maximal information gain due to time integration in central dogma reactions
by: Swarnavo Sarkar, et al.
Published: (2023-06-01)

Using informative behavior to increase engagement while learning from human reward
by: Li, Guangliang, et al.
Published: (2016)

Reward bonuses with gain scheduling inspired by iterative deepening search
by: Taisuke Kobayashi
Published: (2023-09-01)

An Expectation Maximization Algorithm for Continuous Markov Decision Processes with Arbitrary Reward
by: Hoffman, M, et al.
Published: (2009)

An expectation maximization algorithm for continuous Markov decision processes with arbitrary rewards
by: Hoffman, M, et al.
Published: (2009)

Information-Guided Robotic Maximum Seek-and-Sample in Partially Observable Continuous Environments
by: Flaspohler, Genevieve, et al.
Published: (2021)

Strategic Sensor Placement in Expansive Highway Networks: A Novel Framework for Maximizing Information Gain
by: Yunxiang Yang, et al.
Published: (2023-12-01)

Relation Representation Learning via Signed Graph Mutual Information Maximization for Trust Prediction
by: Yongjun Jing, et al.
Published: (2021-01-01)

Do humans produce the speed-accuracy trade-off that maximizes reward rate?
by: Bogacz, R, et al.
Published: (2010)

Community detection in hypergraphs via mutual information maximization
by: Jürgen Kritschgau, et al.
Published: (2024-03-01)

Acquiring Classifiers for Bipolarized Reward by XCS in a Continuous Reward Environment
by: Takato Tatsumi, et al.
Published: (2019-05-01)

Average-reward off-policy policy evaluation with function approximation
by: Zhang, S, et al.
Published: (2021)

Tuning the speed-accuracy trade-off to maximize reward rate in multisensory decision-making
by: Jan Drugowitsch, et al.
Published: (2015-06-01)

Introducing upfront losses as well as gains decreases impatience in intertemporal choices with rewards
by: Cheng-Ming Jiang, et al.
Published: (2014-07-01)

The price of gaining: maximization in decision-making, regret and life satisfaction
by: Emilio Moyano-Díaz, et al.
Published: (2014-09-01)

Reward is not reward: Differential impacts of primary and secondary rewards on expectation, outcome, and prediction error in the human brain's reward processing regions
by: Martin Ulrich, et al.
Published: (2023-12-01)

Social interaction for efficient agent learning from human reward
by: Li, G, et al.
Published: (2017)

On maximum-reward motion in stochastic environments
by: Ma, Fangchang
Published: (2015)

Multimodal Representation Learning via Maximization of Local Mutual Information
by: Liao, Ruizhi, et al.
Published: (2022)

Maximal randomness from partially entangled states
by: Erik Woodhead, et al.
Published: (2020-11-01)

Social interaction for efficient agent learning from human reward
by: Li, Guangliang, et al.
Published: (2017)

A Sufficient Statistic for Influence in Structured Multiagent Environments
by: Oliehoek, Frans, et al.
Published: (2022)

Maximizing Local Rewards on Multi-Agent Quantum Games through Gradient-Based Learning Strategies
by: Agustin Silva, et al.
Published: (2023-10-01)

Correction: Reward Maximization Justifies the Transition from Sensory Selection at Childhood to Sensory Integration at Adulthood
Published: (2014-01-01)

Validation and Psychometric Properties of the French Versions of the Environmental Reward Observation Scale and of the Reward Probability Index
by: Aurélie Wagener, et al.
Published: (2015-06-01)

Integrating reward information for prospective behaviour
by: Hall-McMaster, S, et al.
Published: (2022)

The effect of noninstrumental information on reward learning
by: Embrey, JR, et al.
Published: (2024)

Graphical Approach to Optimization of Maximally Efficient-Gain-Boosted Feedback Amplifiers
by: Yang Xing, et al.
Published: (2023-06-01)

Sex differences in stretch-induced hypertrophy, maximal strength and flexibility gains
by: Konstantin Warneke, et al.
Published: (2023-01-01)

Efficient enhancement of information in the prefrontal cortex during the presence of reward predicting stimuli.
by: Camilo J Mininni, et al.
Published: (2017-01-01)