Information asymmetry in KL-regularized RL
Many real-world tasks exhibit rich structure that is repeated across different parts of the state space or in time. In this work, we study the possibility of leveraging such repeated structure to speed up and regularize learning. We start from the KL-regularized expected reward objective, which introd...
Main Authors: | , , , , , , , , , |
---|---|
Format: | Journal article |
Language: | English |
Published: | International Conference on Learning Representations, 2018 |