Learning to Plan via Deep Optimistic Value Exploration

Deep exploration requires coordinated long-term planning. We present a model-based reinforcement learning algorithm that guides policy learning through a value function that exhibits optimism in the face of uncertainty. We capture uncertainty over values by combining predictions from an ensemble o...

Bibliographic Details
Main Authors:	Seyde, Tim; Schwarting, Wilko; Karaman, Sertac; Rus, Daniela L
Other Authors:	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Format:	Article
Published:	2020
Online Access:	https://hdl.handle.net/1721.1/125161

Similar Items