Transparency-Aware Segmentation of Glass Objects to Train RGB-Based Pose Estimators
Robotic manipulation requires object pose knowledge for the objects of interest. In order to perform typical household chores, a robot needs to be able to estimate 6D poses for objects such as water glasses or salad bowls. This is especially difficult for glass objects, as for these, depth data are mostly disturbed, and in RGB images, occluded objects are still visible. Thus, in this paper, we propose to redefine the ground-truth for training RGB-based pose estimators in two ways: (a) we apply a transparency-aware multisegmentation, in which an image pixel can belong to more than one object, and (b) we use transparency-aware bounding boxes, which always enclose whole objects, even if parts of an object are formally occluded by another object. The latter approach ensures that the size and scale of an object remain more consistent across different images. We train our pose estimator, which was originally designed for opaque objects, with three different ground-truth types on the ClearPose dataset. Just by changing the training data to our transparency-aware segmentation, with no additional glass-specific feature changes in the estimator, the ADD-S AUC value increases by 4.3%. Such a multisegmentation can be created for every dataset that provides a 3D model of the object and its ground-truth pose.
Main Authors: | Maira Weidenbach, Tim Laue, Udo Frese |
Affiliation: | Faculty of Mathematics and Computer Science, University of Bremen, 28359 Bremen, Germany (all authors) |
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-01-01 |
Series: | Sensors, Vol. 24, Issue 2, Article 432 |
ISSN: | 1424-8220 |
DOI: | 10.3390/s24020432 |
Collection: | DOAJ |
Subjects: | neural networks; training data; transparent objects; bounding box; segmentation; pose estimation |
Online Access: | https://www.mdpi.com/1424-8220/24/2/432 |
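The construction described in the abstract — per-object amodal masks and boxes obtained by rendering each object's 3D model at its ground-truth pose — can be illustrated with a short script. The following is a minimal sketch, not the authors' released code: the function names and the mesh/pose inputs are hypothetical, the poses are assumed to be object-to-camera transforms in OpenCV convention with pinhole intrinsics, and the use of trimesh plus pyrender for depth-only rendering is one possible implementation choice, not necessarily the pipeline used in the paper.

```python
# Editor's sketch of transparency-aware ground-truth generation; all names
# (amodal_masks, amodal_bbox, mesh_paths, poses_cam, ...) are hypothetical.
# Requires: numpy, trimesh, pyrender. Offscreen rendering needs an OpenGL
# context (e.g. PYOPENGL_PLATFORM=egl on a headless machine).
import numpy as np
import trimesh
import pyrender

# OpenCV camera frame (x right, y down, z forward) -> OpenGL camera frame
# used by pyrender (y up, camera looks along -z): flip the y and z axes.
CV_TO_GL = np.diag([1.0, -1.0, -1.0, 1.0])

def amodal_masks(mesh_paths, poses_cam, fx, fy, cx, cy, width, height):
    """Render each object *alone* at its ground-truth pose, so its mask
    covers the full silhouette even where another object occludes it.
    Stacking the per-object masks yields a multisegmentation in which a
    single pixel may belong to several objects."""
    camera = pyrender.IntrinsicsCamera(fx=fx, fy=fy, cx=cx, cy=cy)
    renderer = pyrender.OffscreenRenderer(width, height)
    masks = []
    for path, pose in zip(mesh_paths, poses_cam):
        scene = pyrender.Scene()
        mesh = pyrender.Mesh.from_trimesh(trimesh.load(path, force='mesh'))
        # Express the object-to-camera pose in the OpenGL camera frame.
        scene.add(mesh, pose=CV_TO_GL @ pose)
        scene.add(camera, pose=np.eye(4))  # camera sits at the origin
        # Depth-only pass: any pixel with depth > 0 lies on the object.
        depth = renderer.render(scene, flags=pyrender.RenderFlags.DEPTH_ONLY)
        masks.append(depth > 0)
    renderer.delete()
    return np.stack(masks)  # shape (num_objects, H, W), overlaps allowed

def amodal_bbox(mask):
    """Transparency-aware bounding box: the tight box around the full,
    unoccluded silhouette, keeping object scale consistent across images."""
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return None  # object projects entirely outside the image
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
```

Rendering each object in its own scene is what makes the labels transparency-aware: occluders are simply absent from the render, so a pixel behind a glass can carry several object labels, and the box derived from such a mask always encloses the whole object, matching the paper's observation that this only requires a 3D model and a ground-truth pose per object.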