Efficient Hindsight Experience Replay with Transformed Data Augmentation

Motion control of robots is a high-dimensional, nonlinear control problem that is often difficult to handle using traditional dynamical path planning means. Reinforcement learning is currently an effective means to solve robot motion control problems, but reinforcement learning has disadvantages suc...

Bibliographic Details
Main Authors: Jiazheng Sun, Weiguang Li
Format: Article
Language: English
Published: Tamkang University Press 2023-08-01
Series: Journal of Applied Science and Engineering
Subjects:
Online Access: http://jase.tku.edu.tw/articles/jase-202402-27-2-0011
_version_ 1797740144078356480
author Jiazheng Sun
Weiguang Li
collection DOAJ
description Robot motion control is a high-dimensional, nonlinear control problem that is often difficult to handle with traditional dynamics-based path planning methods. Reinforcement learning is an effective way to solve robot motion control problems, but it requires a large number of trial-and-error interactions and suffers from sparse rewards, both of which limit its efficiency in practice. The Hindsight Experience Replay (HER) algorithm is a reinforcement learning algorithm that addresses reward sparsity by constructing virtual target values. However, HER still trains slowly in the early stage, and its sample utilization efficiency can be improved further. Augmenting existing data to improve training efficiency is widely used in supervised learning but is less common in reinforcement learning. In this paper, we propose the Hindsight Experience Replay with Transformed Data Augmentation (TDAHER) algorithm, which combines HER with a transformed data augmentation method for reinforcement learning samples. To address the accuracy problem of the augmented samples in the later stage of training, a decaying participation factor is introduced. Comparisons on four simulated robot control tasks show that the algorithm effectively improves the training efficiency of reinforcement learning.
first_indexed 2024-03-12T14:08:23Z
format Article
id doaj.art-0cdf88963f604ad8afa77aa84935ccd7
institution Directory Open Access Journal
issn 2708-9967
2708-9975
language English
last_indexed 2024-03-12T14:08:23Z
publishDate 2023-08-01
publisher Tamkang University Press
record_format Article
series Journal of Applied Science and Engineering
spelling doaj.art-0cdf88963f604ad8afa77aa84935ccd7
2023-08-21T10:31:09Z
eng
Tamkang University Press
Journal of Applied Science and Engineering
2708-9967
2708-9975
2023-08-01
27120972108
10.6180/jase.202402_27(2).0011
Efficient Hindsight Experience Replay with Transformed Data Augmentation
Jiazheng Sun, School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou, Guangdong, China
Weiguang Li, School of Mechanical and Automotive Engineering, South China University of Technology, Guangzhou, Guangdong, China
http://jase.tku.edu.tw/articles/jase-202402-27-2-0011
reinforcement learning
machine learning
motion control
data augmentation
component
title Efficient Hindsight Experience Replay with Transformed Data Augmentation
topic reinforcement learning
machine learning
motion control
data augmentation
component
url http://jase.tku.edu.tw/articles/jase-202402-27-2-0011