Optimistic exploration even with a pessimistic initialisation

Optimistic initialisation is an effective strategy for efficient exploration in reinforcement learning (RL). In the tabular case, all provably efficient model-free algorithms rely on it. However, model-free deep RL algorithms do not use optimistic initialisation despite taking inspiration from these...

Bibliographic details
Main authors: Whiteson, S; Rashid, T; Peng, B; Bohmer, W
Format: Conference item
Language: English
Published: International Conference on Learning Representations 2020