The Task Decomposition and Dedicated Reward-System-Based Reinforcement Learning Algorithm for Pick-and-Place

This paper proposes a task decomposition and dedicated reward-system-based reinforcement learning algorithm for the Pick-and-Place task, which is one of the high-level tasks of robot manipulators. The proposed method decomposes the Pick-and-Place task into three subtasks: two reaching tasks and one...


Bibliographic Details
Main Authors: Byeongjun Kim, Gunam Kwon, Chaneun Park, Nam Kyu Kwon
Format: Article
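Language: English
Published: MDPI AG 2023-06-01
Series: Biomimetics
Subjects: deep reinforcement learning; Soft Actor-Critic; Pick-and-Place; task decomposition; robot manipulator
Online Access: https://www.mdpi.com/2313-7673/8/2/240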
collection DOAJ
description This paper proposes a reinforcement learning algorithm for the Pick-and-Place task, one of the high-level tasks of robot manipulators, based on task decomposition and a dedicated reward system. The proposed method decomposes the Pick-and-Place task into three subtasks: two reaching tasks and one grasping task. One reaching task approaches the object, and the other reaches the place position. Each reaching task is carried out using the optimal policy of its own agent, trained with Soft Actor-Critic (SAC). Unlike the two reaching tasks, grasping is implemented with simple logic, which is easy to design but may result in improper gripping. To support proper grasping, a dedicated reward system for approaching the object is designed using individual axis-based weights. To verify the validity of the proposed method, we carry out various experiments in the MuJoCo physics engine with the Robosuite framework. According to the simulation results of four trials, the robot manipulator picked up the object and released it at the goal position with an average success rate of 93.2%.
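The decomposition and the axis-weighted reward described above can be illustrated with a minimal sketch. This is not the authors' implementation: the weight values, the distance tolerance, the negative-weighted-distance reward form, and the names `axis_weighted_reward`, `Stage`, and `next_stage` are all assumptions made for illustration. The record only states that the approach reward uses individual axis-based weights and that the task is split into two reaching subtasks and one grasping subtask.

```python
import math
from enum import Enum, auto

# Hypothetical per-axis weights (x, y, z). These values are NOT from the paper;
# they only show how one axis can be penalized more heavily than the others.
AXIS_WEIGHTS = (1.0, 1.0, 2.0)


def axis_weighted_reward(gripper_pos, target_pos, weights=AXIS_WEIGHTS):
    """Return the negative weighted Euclidean distance from gripper to target.

    A larger weight on an axis makes an error along that axis cost more,
    which is one plausible way to steer the approach pose before grasping.
    """
    squared = sum(
        w * (g - t) ** 2 for w, g, t in zip(weights, gripper_pos, target_pos)
    )
    return -math.sqrt(squared)


class Stage(Enum):
    APPROACH = auto()  # reaching subtask 1: move the gripper to the object
    GRASP = auto()     # grasping subtask: simple scripted logic
    PLACE = auto()     # reaching subtask 2: move to the place position
    DONE = auto()


def next_stage(stage, gripper_pos, object_pos, goal_pos, holding, tol=0.01):
    """Advance the subtask sequence; `tol` is an assumed distance threshold."""
    if stage is Stage.APPROACH and math.dist(gripper_pos, object_pos) < tol:
        return Stage.GRASP
    if stage is Stage.GRASP and holding:
        return Stage.PLACE
    if stage is Stage.PLACE and math.dist(gripper_pos, goal_pos) < tol:
        return Stage.DONE
    return stage
```

In a full system, each reaching stage would query its own trained SAC policy for actions, while the sequencer above only decides which subtask is active. With the assumed weights, a unit error along z yields a lower reward than a unit error along x.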
first_indexed 2024-03-11T02:42:24Z
id doaj.art-0669baea245e4893909f2e5ea9db1cb3
institution Directory Open Access Journal
issn 2313-7673
last_indexed 2024-03-11T02:42:24Z
spelling doaj.art-0669baea245e4893909f2e5ea9db1cb3; 2023-11-18T09:29:41Z; eng; MDPI AG; Biomimetics; ISSN 2313-7673; 2023-06-01; vol. 8, no. 2, art. 240; DOI 10.3390/biomimetics8020240
Byeongjun Kim: Department of Electronic Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea
Gunam Kwon: Department of Electronic Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea
Chaneun Park: School of Electronics Engineering, Kyungpook National University, Daegu 41566, Republic of Korea
Nam Kyu Kwon: Department of Electronic Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea
topic deep reinforcement learning
Soft Actor-Critic
Pick-and-Place
task decomposition
robot manipulator