Deep reinforcement learning using least‐squares truncated temporal‐difference

Abstract: Policy evaluation (PE) is a critical sub‐problem in reinforcement learning: it estimates the value function for a given policy, and that estimate can be used for policy improvement. However, current PE methods still have limitations, such as low sample efficiency and local convergence, e...

Bibliographic Details
Main Authors:	Junkai Ren, Yixing Lan, Xin Xu, Yichuan Zhang, Qiang Fang, Yujun Zeng
Format:	Article
Language:	English
Published:	Wiley 2024-04-01
Series:	CAAI Transactions on Intelligence Technology
Subjects:	Deep reinforcement learning; policy evaluation; temporal difference; value function approximation
Online Access:	https://doi.org/10.1049/cit2.12202

Similar Items