Consistent Experience Replay in High-Dimensional Continuous Control with Decayed Hindsights
The manipulation of complex robots, which generally involves high-dimensional continuous control without an accurate dynamics model, calls for the study and application of reinforcement learning (RL) algorithms. Typically, RL learns with the objective of maximizing the accumulated rewards from interactions...
Main Author: | |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-09-01 |
Series: | Machines |
Subjects: | |
Online Access: | https://www.mdpi.com/2075-1702/10/10/856 |