Learning to Plan via Deep Optimistic Value Exploration
Deep exploration requires coordinated long-term planning. We present a model-based reinforcement learning algorithm that guides policy learning through a value function that exhibits optimism in the face of uncertainty. We capture uncertainty over values by combining predictions from an ensemble o...
Main Authors: | Seyde, Tim; Schwarting, Wilko; Karaman, Sertac; Rus, Daniela L |
---|---|
Other Authors: | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory |
Format: | Article |
Published: | 2020 |
Online Access: | https://hdl.handle.net/1721.1/125161 |
Similar Items
- Stochastic Dynamic Games in Belief Space
  by: Schwarting, Wilko, et al.
  Published: (2022)
- Semi-Cooperative Control for Autonomous Emergency Vehicles
  by: Buckman, Noam, et al.
  Published: (2022)
- Sharing is Caring: Socially-Compliant Autonomous Intersection Negotiation
  by: Buckman, Noam, et al.
  Published: (2020)
- Social behavior for autonomous vehicles
  by: Schwarting, Wilko, et al.
  Published: (2021)
- Parallel Autonomy in Automated Vehicles: Safe Motion Generation with Minimal Intervention
  by: Schwarting, Wilko, et al.
  Published: (2017)