Robust reinforcement learning with Bayesian optimisation and quadrature
Bayesian optimisation has been successfully applied to a variety of reinforcement learning problems. However, the traditional approach for learning optimal policies in simulators does not utilise the opportunity to improve learning by adjusting certain environment variables: state features that are...
Main Authors: | Paul, S, Chatzilygeroudis, K, Ciosek, K, Mouret, J-B, Osborne, MA, Whiteson, S |
---|---|
Format: | Journal article |
Language: | English |
Published: | Journal of Machine Learning Research, 2020 |
Similar items
- Alternating optimisation and quadrature for robust control
  by: Paul, S, et al.
  Published: (2018)
- Fingerprint policy optimisation for robust reinforcement learning
  by: Paul, S, et al.
  Published: (2019)
- Bayesian Gaussian processes for sequential prediction, optimisation and quadrature
  by: Osborne, M, et al.
  Published: (2010)
- Expected policy gradients for reinforcement learning
  by: Ciosek, K, et al.
  Published: (2020)
- OFFER: Off-environment reinforcement learning
  by: Ciosek, K, et al.
  Published: (2017)