Deep Reinforcement Learning with Phasic Policy Gradient with Sample Reuse

Purposes: The phasic policy gradient with sample reuse (SR-PPG) algorithm is proposed to address the lack of sample reuse and the low sample utilization in policy-based deep reinforcement learning algorithms. Methods: In the proposed algorithm, offline data are introduced on the basis of th...

Bibliographic Details
Main Authors: LI Hailiang, WANG Li
Format: Article
Language: English
Published: Editorial Office of Journal of Taiyuan University of Technology 2024-07-01
Series: Taiyuan Ligong Daxue xuebao
Online Access: https://tyutjournal.tyut.edu.cn/englishpaper/show-2319.html