Texture-Less Shiny Objects Grasping in a Single RGB Image Using Synthetic Training Data

Bibliographic Details
Main Authors: Chen Chen, Xin Jiang, Shu Miao, Weiguo Zhou, Yunhui Liu
Format: Article
Language: English
Published: MDPI AG, 2022-06-01
Series: Applied Sciences
Subjects: synthetic training data; shiny object pose estimation; single RGB image
Online Access: https://www.mdpi.com/2076-3417/12/12/6188
author Chen Chen
Xin Jiang
Shu Miao
Weiguo Zhou
Yunhui Liu
collection DOAJ
description In the industrial domain, estimating the pose of texture-less shiny parts is challenging but worthwhile. In this setting, it is impractical to rely on texture information to obtain the pose, because such features are easily affected by the surrounding objects. In addition, the metal parts are similar in color, which makes object segmentation challenging. This study proposes dividing the entire process into three steps: object detection, feature extraction, and pose estimation. We use Mask R-CNN to detect objects and HRNet to extract the corresponding features. For metal parts of different shapes, different keypoints are chosen accordingly. Conventional contour-based methods are inapplicable to parts containing planar surfaces because the objects occlude each other in cluttered environments; in this case, we use dense discrete points along the edges as semantic keypoints. For parts containing cylindrical components, we choose skeleton points as semantic keypoints. We then combine the localization of the semantic keypoints with the corresponding CAD model information to estimate the 6D pose of each visible object (see the sketch appended after this record). Because deep learning approaches require massive training datasets and intensive labeling, we also propose a method to generate training datasets and label them automatically. Experiments show that the algorithm trained on synthetic data performs well in real scenes, despite not using any real images for training.
first_indexed 2024-03-10T00:28:21Z
format Article
id doaj.art-21032b9d7785420c9540ba17cf25597e
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-10T00:28:21Z
publishDate 2022-06-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-21032b9d7785420c9540ba17cf25597e (2023-11-23T15:29:17Z)
Chen Chen, Xin Jiang, Shu Miao, Weiguo Zhou, Yunhui Liu. Texture-Less Shiny Objects Grasping in a Single RGB Image Using Synthetic Training Data. Applied Sciences 12(12): 6188, MDPI AG, 2022-06-01. ISSN 2076-3417. doi:10.3390/app12126188. https://www.mdpi.com/2076-3417/12/12/6188
Author affiliations: Chen Chen, Xin Jiang, Shu Miao, and Yunhui Liu are with Mechanical Engineering and Automation, Harbin Institute of Technology, Shenzhen 518055, China; Weiguo Zhou is with the Department of Mechanical Engineering, The Chinese University of Hong Kong, Hong Kong, China.
Keywords: synthetic training data; shiny object pose estimation; single RGB image
title Texture-Less Shiny Objects Grasping in a Single RGB Image Using Synthetic Training Data
topic synthetic training data
shiny object pose estimation
single RGB image
url https://www.mdpi.com/2076-3417/12/12/6188
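
The pose-estimation step described in the abstract (2D semantic keypoints matched to CAD keypoints yielding a 6D pose) and the automatic labeling of synthetic images are both standard perspective-geometry operations. Below is a minimal sketch of those two steps, not the authors' implementation: it assumes OpenCV and NumPy, the keypoint coordinates, camera intrinsics, and ground-truth pose are made-up illustrative values, and the detection (Mask R-CNN) and keypoint-regression (HRNet) stages are omitted.

import cv2
import numpy as np

# 3D semantic keypoints defined on the part's CAD model (object frame, metres).
# For a part with planar surfaces these could be dense points along its edges;
# for a cylindrical part, points along its skeleton.
cad_keypoints = np.array([
    [0.000, 0.000, 0.000],
    [0.050, 0.000, 0.000],
    [0.050, 0.030, 0.000],
    [0.000, 0.030, 0.000],
    [0.025, 0.015, 0.010],
    [0.010, 0.005, 0.010],
], dtype=np.float64)

# Pinhole intrinsics of the (virtual) camera used to render the synthetic image.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])
dist = np.zeros(5)  # rendered images have no lens distortion

# Ground-truth object pose used by the renderer (axis-angle rotation + translation).
rvec_gt = np.array([0.20, -0.10, 0.05])
tvec_gt = np.array([0.02, -0.01, 0.40])

# Step 1 - automatic labeling: project the CAD keypoints through the known pose.
# These 2D points are the labels a keypoint network (e.g. HRNet) is trained to predict.
labels_2d, _ = cv2.projectPoints(cad_keypoints, rvec_gt, tvec_gt, K, dist)
labels_2d = labels_2d.reshape(-1, 2)

# Step 2 - pose estimation: predicted 2D keypoints + CAD 3D keypoints -> 6D pose via PnP.
# Here the "predictions" are simply the labels with a little noise added.
detected_2d = labels_2d + np.random.normal(scale=0.5, size=labels_2d.shape)
ok, rvec_est, tvec_est = cv2.solvePnP(cad_keypoints, detected_2d, K, dist,
                                      flags=cv2.SOLVEPNP_ITERATIVE)
R_est, _ = cv2.Rodrigues(rvec_est)

print("PnP converged:", ok)
print("estimated rotation matrix:\n", R_est)
print("estimated translation (m):", tvec_est.ravel())

In a full pipeline the 2D keypoints would come from the keypoint network's predictions inside each detected instance rather than from the projection above; cv2.solvePnPRansac is a common substitute when some predicted keypoints are unreliable.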