Information asymmetry in KL-regularized RL

Many real-world tasks exhibit rich structure that is repeated across different parts of the state space or in time. In this work, we study the possibility of leveraging such repeated structure to speed up and regularize learning. We start from the KL-regularized expected reward objective which introd...

Bibliographic Details
Main Authors: Galashov, A; Jayakumar, SM; Hasenclever, L; Tirumala, D; Schwarz, J; Desjardins, G; Czarnecki, WM; Teh, YW; Pascanu, R; Heess, N
Format: Journal article
Language: English
Published: International Conference on Learning Representations 2018