Stochastic Double Deep Q-Network

Estimation bias seriously affects the performance of reinforcement learning algorithms. The maximum operation may result in overestimation, while the double estimator operation often leads to underestimation. To eliminate the estimation bias, these two operations are combined together in our propose...
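The two biases named in the abstract can be illustrated with a small Monte Carlo sketch. It is not the algorithm proposed in the article, whose description is truncated in this record. It only compares a plain maximum estimator with a double estimator on a toy problem. The function name `estimate_biases`, the Gaussian noise model, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def estimate_biases(true_means, noise_std=1.0, n_samples=10,
                    n_trials=20000, seed=0):
    """Return (max_bias, double_bias) for estimating max(true_means).

    Hypothetical toy setup: each action's value is observed through
    Gaussian noise, and two independent sample sets (A and B) are drawn.
    """
    rng = np.random.default_rng(seed)
    true_means = np.asarray(true_means, dtype=float)
    shape = (n_trials, n_samples, true_means.size)

    # Two independent sets of noisy observations per action.
    set_a = true_means + noise_std * rng.standard_normal(shape)
    set_b = true_means + noise_std * rng.standard_normal(shape)
    mean_a = set_a.mean(axis=1)  # shape: (n_trials, n_actions)
    mean_b = set_b.mean(axis=1)

    # Maximum estimator: select and evaluate with the same sample set.
    max_est = mean_a.max(axis=1)

    # Double estimator: select the action with set A, evaluate it with set B.
    best_a = mean_a.argmax(axis=1)
    double_est = mean_b[np.arange(n_trials), best_a]

    target = true_means.max()
    return max_est.mean() - target, double_est.mean() - target


max_bias, double_bias = estimate_biases(np.linspace(0.0, 1.0, 5))
print(f"max estimator bias:    {max_bias:+.3f}")
print(f"double estimator bias: {double_bias:+.3f}")
```

Under these assumptions the maximum estimator's average error comes out positive and the double estimator's comes out negative, which matches the overestimation and underestimation described above. The size of each bias depends on the noise level, the sample size, and the gaps between the true action values.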

Bibliographic Details
Main Authors: Pingli Lv, Xuesong Wang, Yuhu Cheng, Ziming Duan
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/8736298/