Deep variational reinforcement learning for POMDPs

Many real-world sequential decision-making problems are partially observable by nature, and the environment model is typically unknown. Consequently, there is great need for reinforcement learning methods that can tackle such problems given only a stream of incomplete and noisy observations. In this...

Bibliographic details
Main authors:	Igl, M, Zintgraf, L, Le, T, Wood, F, Whiteson, S
Format:	Conference item
Published in:	Journal of Machine Learning Research 2018

Similar items

Exploration in approximate hyper-state space for meta reinforcement learning
by: Zintgraf, L, et al.
Published: (2021)

Reinforcement learning with limited reinforcement: Using Bayes risk for active learning in POMDPs
by: Pineau, Joelle, et al.
Published: (2017)

VariBAD: a very good method for Bayes-adaptive deep RL via meta-learning
by: Zintgraf, L, et al.
Published: (2020)

Multi-Agent Active Perception Based on Reinforcement Learning and POMDP
by: Tarik Selimovic, et al.
Published: (2024-01-01)

TreeQN and ATreeC: differentiable tree planning for deep reinforcement learning
by: Farquhar, G, et al.
Published: (2018)

Transient non-stationarity and generalisation in deep reinforcement learning
by: Igl, M, et al.
Published: (2021)

Incremental Clustering and Expansion for Faster Optimal Planning in Dec-POMDPs
by: Oliehoek, Frans A., et al.
Published: (2013)

Stick-breaking policy learning in Dec-POMDPs
by: Amato, Christopher, et al.
Published: (2016)

Inductive biases and generalisation for deep reinforcement learning
by: Igl, M
Published: (2021)

Fast adaptation via meta reinforcement learning
by: Zintgraf, L
Published: (2022)

An online algorithm for constrained POMDPs
by: Undurti, Aditya, et al.
Published: (2011)

Improved Deep Recurrent Q-Network of POMDPs for Automated Penetration Testing
by: Yue Zhang, et al.
Published: (2022-10-01)

Monte-Carlo planning in large POMDPs
by: Silver, David, et al.
Published: (2015)

Planning with Macro-Actions in Decentralized POMDPs
by: Amato, Christopher, et al.
Published: (2016)

RAO*: an Algorithm for Chance-Constrained POMDP’s
by: Santana, Pedro, et al.
Published: (2016)

Safe POMDP online planning via shielding
by: Sheng, S, et al.
Published: (2024)

Modeling and Planning with Macro-Actions in Decentralized POMDPs
by: Amato, Christopher, et al.
Published: (2021)

Sampling-based algorithms for continuous-time POMDPs
by: Chaudhari, Pratik Anil, et al.
Published: (2013)

Trust oriented decision making via POMDPs
by: Aravazhi Irissappane, Athirai
Published: (2016)

Policy Evaluation in Decentralized POMDPs With Belief Sharing
by: Mert Kayaalp, et al.
Published: (2023-01-01)

DGA domain detection and botnet prevention using Q-learning for POMDP
by: Y. V. Bubnov, et al.
Published: (2021-03-01)

Policy Improvement for POMDPs Using Normalized Importance Sampling
by: Shelton, Christian R.
Published: (2004)

Spatial and Temporal Abstractions in POMDPs Applied to Robot Navigation
by: Theocharous, Georgios, et al.
Published: (2005)

A POMDP Approach to Map Victims in Disaster Scenarios
by: Pedro Gabriel Villani, et al.
Published: (2024-11-01)

Spectrum Access Algorithm Based on POMDP Model in CVANET
by: Xuefei Zhang, et al.
Published: (2014-09-01)

Bottom-up learning of hierarchical models in a class of deterministic POMDP environments
by: Itoh Hideaki, et al.
Published: (2015-09-01)

Deep residual reinforcement learning
by: Zhang, S, et al.
Published: (2020)

Efficient POMDP Forward Search by Predicting the Posterior Belief Distribution
by: Roy, Nicholas, et al.
Published: (2009)

Interference Coordination Based on POMDP in Multi-Cell OFDMA System
by: Qiang Wei, et al.
Published: (2013-04-01)

Cognitive radio auto-adaptive sensing algorithm based on POMDP
by: Rui-chen XU, et al.
Published: (2013-06-01)

Point-Based Policy Transformation: Adapting Policy to Changing POMDP Models
by: Kurniawati, Hanna, et al.
Published: (2019)

Recent Advances in Deep Reinforcement Learning Applications for Solving Partially Observable Markov Decision Processes (POMDP) Problems Part 2—Applications in Transportation, Industries, Communications and Networking and More Topics
by: Xuanchen Xiang, et al.
Published: (2021-10-01)

Recent Advances in Deep Reinforcement Learning Applications for Solving Partially Observable Markov Decision Processes (POMDP) Problems: Part 1—Fundamentals and Applications in Games, Robotics and Natural Language Processing
by: Xuanchen Xiang, et al.
Published: (2021-07-01)

CAR-DESPOT: causally-informed online POMDP planning for robots in confounded environments
by: Cannizzaro, R, et al.
Published: (2023)

DualSMC: Tunneling Differentiable Filtering and Planning under Continuous POMDPs
by: Wang, Yunbo, et al.
Published: (2021)

Personalized Cotesting Policies for Cervical Cancer Screening: A POMDP Approach
by: Malek Ebadi, et al.
Published: (2021-03-01)

A POMDP Framework for Coordinated Guidance of Autonomous UAVs for Multitarget Tracking
Published: (2009-03-01)