The Task Decomposition and Dedicated Reward-System-Based Reinforcement Learning Algorithm for Pick-and-Place

This paper proposes a task decomposition and dedicated reward-system-based reinforcement learning algorithm for the Pick-and-Place task, which is one of the high-level tasks of robot manipulators. The proposed method decomposes the Pick-and-Place task into three subtasks: two reaching tasks and one...

Bibliographic Details
Main Authors:	Byeongjun Kim, Gunam Kwon, Chaneun Park, Nam Kyu Kwon
Format:	Article
Language:	English
Published:	MDPI AG 2023-06-01
Series:	Biomimetics
Subjects:	deep reinforcement learning; Soft Actor-Critic; Pick-and-Place; task decomposition; robot manipulator
Online Access:	https://www.mdpi.com/2313-7673/8/2/240

_version_	1827738297686818816
author	Byeongjun Kim; Gunam Kwon; Chaneun Park; Nam Kyu Kwon
author_facet	Byeongjun Kim; Gunam Kwon; Chaneun Park; Nam Kyu Kwon
author_sort	Byeongjun Kim
collection	DOAJ
description	This paper proposes a reinforcement learning algorithm, based on task decomposition and a dedicated reward system, for the Pick-and-Place task, one of the high-level tasks of robot manipulators. The proposed method decomposes the Pick-and-Place task into three subtasks: two reaching tasks and one grasping task. One reaching task approaches the object, and the other reaches the place position. Each reaching task is carried out by the optimal policy of an agent trained with Soft Actor-Critic (SAC). Unlike the two reaching tasks, grasping is implemented via simple logic, which is easy to design but may result in improper gripping. To support the grasping task, a dedicated reward system for approaching the object is designed using individual axis-based weights. To verify the validity of the proposed method, we carry out various experiments in the MuJoCo physics engine with the Robosuite framework. According to the simulation results of four trials, the robot manipulator picked up the object and released it in the goal position with an average success rate of 93.2%.
first_indexed	2024-03-11T02:42:24Z
format	Article
id	doaj.art-0669baea245e4893909f2e5ea9db1cb3
institution	Directory of Open Access Journals
issn	2313-7673
language	English
last_indexed	2024-03-11T02:42:24Z
publishDate	2023-06-01
publisher	MDPI AG
record_format	Article
series	Biomimetics
spelling	doaj.art-0669baea245e4893909f2e5ea9db1cb3; 2023-11-18T09:29:41Z; eng; MDPI AG; Biomimetics; 2313-7673; 2023-06-01; vol. 8, no. 2, art. 240; 10.3390/biomimetics8020240; The Task Decomposition and Dedicated Reward-System-Based Reinforcement Learning Algorithm for Pick-and-Place; Byeongjun Kim (Department of Electronic Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea); Gunam Kwon (Department of Electronic Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea); Chaneun Park (School of Electronics Engineering, Kyungpook National University, Daegu 41566, Republic of Korea); Nam Kyu Kwon (Department of Electronic Engineering, Yeungnam University, Gyeongsan 38541, Republic of Korea); This paper proposes a reinforcement learning algorithm, based on task decomposition and a dedicated reward system, for the Pick-and-Place task, one of the high-level tasks of robot manipulators. The proposed method decomposes the Pick-and-Place task into three subtasks: two reaching tasks and one grasping task. One reaching task approaches the object, and the other reaches the place position. Each reaching task is carried out by the optimal policy of an agent trained with Soft Actor-Critic (SAC). Unlike the two reaching tasks, grasping is implemented via simple logic, which is easy to design but may result in improper gripping. To support the grasping task, a dedicated reward system for approaching the object is designed using individual axis-based weights. To verify the validity of the proposed method, we carry out various experiments in the MuJoCo physics engine with the Robosuite framework. According to the simulation results of four trials, the robot manipulator picked up the object and released it in the goal position with an average success rate of 93.2%.; https://www.mdpi.com/2313-7673/8/2/240; deep reinforcement learning; Soft Actor-Critic; Pick-and-Place; task decomposition; robot manipulator
spellingShingle	Byeongjun Kim; Gunam Kwon; Chaneun Park; Nam Kyu Kwon; The Task Decomposition and Dedicated Reward-System-Based Reinforcement Learning Algorithm for Pick-and-Place; Biomimetics; deep reinforcement learning; Soft Actor-Critic; Pick-and-Place; task decomposition; robot manipulator
title	The Task Decomposition and Dedicated Reward-System-Based Reinforcement Learning Algorithm for Pick-and-Place
title_full	The Task Decomposition and Dedicated Reward-System-Based Reinforcement Learning Algorithm for Pick-and-Place
title_fullStr	The Task Decomposition and Dedicated Reward-System-Based Reinforcement Learning Algorithm for Pick-and-Place
title_full_unstemmed	The Task Decomposition and Dedicated Reward-System-Based Reinforcement Learning Algorithm for Pick-and-Place
title_short	The Task Decomposition and Dedicated Reward-System-Based Reinforcement Learning Algorithm for Pick-and-Place
title_sort	task decomposition and dedicated reward system based reinforcement learning algorithm for pick and place
topic	deep reinforcement learning; Soft Actor-Critic; Pick-and-Place; task decomposition; robot manipulator
url	https://www.mdpi.com/2313-7673/8/2/240
work_keys_str_mv	AT byeongjunkim thetaskdecompositionanddedicatedrewardsystembasedreinforcementlearningalgorithmforpickandplace AT gunamkwon thetaskdecompositionanddedicatedrewardsystembasedreinforcementlearningalgorithmforpickandplace AT chaneunpark thetaskdecompositionanddedicatedrewardsystembasedreinforcementlearningalgorithmforpickandplace AT namkyukwon thetaskdecompositionanddedicatedrewardsystembasedreinforcementlearningalgorithmforpickandplace AT byeongjunkim taskdecompositionanddedicatedrewardsystembasedreinforcementlearningalgorithmforpickandplace AT gunamkwon taskdecompositionanddedicatedrewardsystembasedreinforcementlearningalgorithmforpickandplace AT chaneunpark taskdecompositionanddedicatedrewardsystembasedreinforcementlearningalgorithmforpickandplace AT namkyukwon taskdecompositionanddedicatedrewardsystembasedreinforcementlearningalgorithmforpickandplace
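
Illustrative note: the description in this record mentions a three-subtask decomposition and a reward for approaching the object built from individual axis-based weights, but it does not give the reward formula. The Python sketch below is only one plausible reading, assuming a negative weighted Euclidean distance between gripper and target. The names axis_weighted_reward, next_subtask, and SUBTASKS, and the example weight values, are hypothetical and are not taken from the article or from Robosuite.

```python
import math

# Subtask order as stated in the record's description:
# reach the object, grasp it with simple logic, then reach the place position.
SUBTASKS = ("reach_object", "grasp", "reach_place")


def axis_weighted_reward(gripper_pos, target_pos, weights=(1.0, 1.0, 1.0)):
    """Return a negative weighted distance (assumed form, not the paper's formula).

    Each axis error is squared and scaled by its own weight, so a larger
    weight penalizes misalignment on that axis more strongly.
    """
    weighted_sq = sum(
        w * (g - t) ** 2 for w, g, t in zip(weights, gripper_pos, target_pos)
    )
    return -math.sqrt(weighted_sq)


def next_subtask(current):
    """Advance to the following subtask, or return None after the last one."""
    idx = SUBTASKS.index(current)
    return SUBTASKS[idx + 1] if idx + 1 < len(SUBTASKS) else None
```

With hypothetical weights such as (1, 1, 4), a unit error along z is penalized twice as heavily as a unit error along x, which is the kind of per-axis emphasis that could help the gripper arrive in a pose suited to a simple grasping rule.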
