Expected policy gradients for reinforcement learning

We propose expected policy gradients (EPG), which unify stochastic policy gradients (SPG) and deterministic policy gradients (DPG) for reinforcement learning. Inspired by expected sarsa, EPG integrates (or sums) across actions when estimating the gradient, instead of relying only on the action in th...
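
The contrast the abstract draws, summing over actions rather than using only a sampled action, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single state, a softmax policy over three discrete actions, and action values `q` that are already known exactly (in practice a learned critic would supply them). All function names, the parameters `theta`, and the numbers are hypothetical and chosen only for illustration.

```python
import numpy as np

def softmax(theta):
    # Numerically stable softmax over action preferences.
    z = np.exp(theta - np.max(theta))
    return z / z.sum()

def grad_log_pi(theta, a):
    # Gradient of log pi(a) w.r.t. theta for a softmax policy: onehot(a) - pi.
    g = -softmax(theta)
    g[a] += 1.0
    return g

def spg_estimate(theta, q, rng):
    # Stochastic policy gradient: uses only one sampled action.
    pi = softmax(theta)
    a = rng.choice(len(pi), p=pi)
    return grad_log_pi(theta, a) * q[a]

def epg_estimate(theta, q):
    # Expected policy gradient: sums over every action, weighted by pi(a).
    pi = softmax(theta)
    return sum(pi[a] * grad_log_pi(theta, a) * q[a] for a in range(len(pi)))

# Assumed toy values (not from the paper).
theta = np.array([0.5, -0.2, 0.1])
q = np.array([1.0, 0.0, 2.0])
rng = np.random.default_rng(0)

spg_samples = np.array([spg_estimate(theta, q, rng) for _ in range(20000)])
epg = epg_estimate(theta, q)

print("EPG gradient:        ", epg)
print("Mean of SPG samples: ", spg_samples.mean(axis=0))
print("Per-sample SPG std:  ", spg_samples.std(axis=0))
```

In this toy setting the two estimators agree in expectation, but the summed version involves no sampling over actions, so its value does not vary from call to call. For continuous actions the sum would become an integral, which this sketch does not cover.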

Bibliographic Details
Main Authors:	Ciosek, K, Whiteson, S
Format:	Journal article
Language:	English
Published:	Journal of Machine Learning Research 2020

Similar Items

Expected policy gradients
by: Ciosek, K, et al.
Published: (2018)

Fourier policy gradients
by: Fellows, M, et al.
Published: (2018)

OFFER: Off-environment reinforcement learning
by: Ciosek, K, et al.
Published: (2017)

Robust reinforcement learning with Bayesian optimisation and quadrature
by: Paul, S, et al.
Published: (2020)

Alternating optimisation and quadrature for robust control
by: Paul, S, et al.
Published: (2018)

A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning
by: Kim, Dong-Ki, et al.
Published: (2022)

Deep reinforcement learning with robust deep deterministic policy gradient
by: Teckchai Tiong, et al.
Published: (2020)

Fingerprint policy optimisation for robust reinforcement learning
by: Paul, S, et al.
Published: (2019)

Counterfactual multi-agent policy gradients
by: Foerster, J, et al.
Published: (2018)

Fast efficient hyperparameter tuning for policy gradient methods
by: Paul, S, et al.
Published: (2019)

Mean-variance policy iteration for risk-averse reinforcement learning
by: Zhang, S, et al.
Published: (2021)

Exploration in Gradient-Based Reinforcement Learning
by: Meuleau, Nicolas, et al.
Published: (2004)

Loaded DiCE: Trading off bias and variance in any-order score function gradient estimators for reinforcement learning
by: Farquhar, G, et al.
Published: (2019)

Inverse reinforcement learning from failure
by: Shiarlis, K, et al.
Published: (2016)

Distributed Bayesian learning with stochastic natural gradient expectation propagation and the posterior server
by: Hasenclever, L, et al.
Published: (2017)

FACMAC: Factored multi-agent centralised policy gradients
by: Peng, B, et al.
Published: (2022)

Multileave gradient descent for fast online learning to rank
by: Whiteson, S, et al.
Published: (2016)

Deep residual reinforcement learning
by: Zhang, S, et al.
Published: (2020)

Learning retrospective knowledge with reverse reinforcement learning
by: Zhang, S, et al.
Published: (2020)

Bayesian action decoder for deep multi-agent reinforcement learning
by: Whiteson, S
Published: (2019)

Reinforcement Learning by Policy Search
by: Peshkin, Leonid
Published: (2004)

Learning to communicate with Deep multi-agent reinforcement learning
by: Foerster, J, et al.
Published: (2016)

Deep variational reinforcement learning for POMDPs
by: Igl, M, et al.
Published: (2018)

VIREL: A variational inference framework for reinforcement learning
by: Fellows, M, et al.
Published: (2019)

GradientDICE: rethinking generalized offline estimation of stationary values
by: Zhang, S, et al.
Published: (2020)

Stabilization Policy, Expected Output and Employment.
by: Bond, S
Published: (1988)

On Expectations, Government Policy and the Rate of Investment.
by: Nickell, S
Published: (1974)

Exploration in approximate hyper-state space for meta reinforcement learning
by: Zintgraf, L, et al.
Published: (2021)

Learning and expectations in macroeconomics
by: Evans, George W., 1949-, et al.
Published: (2001)

Verifiable reinforcement learning via policy extraction
by: Solar Lezama, Armando, et al.
Published: (2021)

Verified probabilistic policies for deep reinforcement learning
by: Bacci, E, et al.
Published: (2022)

Off-policy reinforcement learning with Gaussian processes
by: Chowdhary, Girish, et al.
Published: (2015)

Nonparametric Bayesian Policy Priors for Reinforcement Learning
by: Doshi-Velez, Finale P., et al.
Published: (2011)

Transient non-stationarity and generalisation in deep reinforcement learning
by: Igl, M, et al.
Published: (2021)

Multi-agent common knowledge reinforcement learning
by: de Witt, C, et al.
Published: (2019)

Policy gradient methods for linear quadratic problems
by: Yang, H
Published: (2022)

TreeQN and ATreeC: differentiable tree planning for deep reinforcement learning
by: Farquhar, G, et al.
Published: (2018)

Inflation-Target Expectations and Optimal Monetary Policy.
by: Kapadia, S
Published: (2005)

Inflation-target expectations and optimal monetary policy
by: Kapadia, S
Published: (2005)

Reinforcement learning enhanced quantum-inspired algorithm for combinatorial optimization
by: Beloborodov, D, et al.
Published: (2020)