Deep Reinforcement Learning with Phasic Policy Gradient with Sample Reuse
Purposes: The phasic policy gradient with sample reuse (SR-PPG) algorithm is proposed to address the problems of non-reuse of samples and low sample utilization in policy-based deep reinforcement learning algorithms. Methods: In the proposed algorithm, offline data are introduced on the basis of th...
Main Authors: | , |
---|---|
Format: | Article |
Language: | English |
Published: | Editorial Office of Journal of Taiyuan University of Technology, 2024-07-01 |
Series: | Taiyuan Ligong Daxue xuebao |
Subjects: | |
Online Access: | https://tyutjournal.tyut.edu.cn/englishpaper/show-2319.html |