Robust reinforcement learning with Bayesian optimisation and quadrature