Estimation of 6D Object Pose Using a 2D Bounding Box

This paper provides an efficient way of addressing the problem of detecting or estimating the 6-Dimensional (6D) pose of objects from an RGB image. A quaternion is used to define an object's three-dimensional pose, but the pose represented by q and the pose represented by -q are equivalent, and the...
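
The q / -q ambiguity described in the abstract can be checked numerically. The following is a minimal Python sketch, not the paper's implementation. The function names are hypothetical, the (w, x, y, z) component order is an assumption, and `sign_invariant_loss` is only one common way to make a loss insensitive to the sign of the quaternion. This record does not give the formula of the Q-Net loss.

```python
import math


def normalize(q):
    """Scale a 4-vector to unit length, as a normalization layer would."""
    n = math.sqrt(sum(c * c for c in q))
    if n == 0.0:
        raise ValueError("zero quaternion cannot be normalized")
    return tuple(c / n for c in q)


def quat_to_matrix(q):
    """Rotation matrix of a unit quaternion q = (w, x, y, z) (assumed order)."""
    w, x, y, z = q
    return (
        (1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)),
        (2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)),
        (2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)),
    )


def l2_loss(q, p):
    """Plain squared L2 distance between two quaternions."""
    return sum((a - b) ** 2 for a, b in zip(q, p))


def sign_invariant_loss(q, p):
    """Illustrative only: the smaller of the L2 losses to p and to -p."""
    neg_p = tuple(-c for c in p)
    return min(l2_loss(q, p), l2_loss(q, neg_p))


q = normalize((1.0, 2.0, 3.0, 4.0))
neg_q = tuple(-c for c in q)

# Every matrix entry is built from products of two components,
# so flipping the sign of all four components leaves the rotation unchanged.
same_rotation = quat_to_matrix(q) == quat_to_matrix(neg_q)

# For unit quaternions the squared L2 distance between q and -q is 4,
# even though both describe the same pose.
plain = l2_loss(q, neg_q)

# A sign-invariant loss treats the two as identical.
invariant = sign_invariant_loss(q, neg_q)

print(same_rotation, plain, invariant)
```

The sketch shows why a plain L2 loss penalizes a prediction of -q heavily although it encodes the same rotation as q, which is the motivation the abstract gives for a dedicated quaternion pose loss.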

Bibliographic Details
Main Authors:	Yong Hong, Jin Liu, Zahid Jahangir, Sheng He, Qing Zhang
Format:	Article
Language:	English
Published:	MDPI AG 2021-04-01
Series:	Sensors
Subjects:	6D pose estimation; quaternion; Bounding Box Equation; LineMod
Online Access:	https://www.mdpi.com/1424-8220/21/9/2939

_version_	1797536748054511616
author	Yong Hong; Jin Liu; Zahid Jahangir; Sheng He; Qing Zhang
author_facet	Yong Hong; Jin Liu; Zahid Jahangir; Sheng He; Qing Zhang
author_sort	Yong Hong
collection	DOAJ
description	This paper provides an efficient way of addressing the problem of detecting or estimating the 6-dimensional (6D) pose of objects from an RGB image. A quaternion is used to define an object's three-dimensional pose, but the pose represented by q and the pose represented by -q are equivalent, while the L2 loss between them is very large. Therefore, we define a new quaternion pose loss function to solve this problem. Based on this, we designed a new convolutional neural network named Q-Net to estimate an object's pose. Because the quaternion output must be a unit vector, a normalization layer is added in Q-Net to keep the pose output on the unit sphere in four-dimensional space. We propose a new algorithm, called the Bounding Box Equation, to obtain 3D translation quickly and effectively from 2D bounding boxes. The algorithm uses an entirely new way of assessing the 3D rotation (R) and 3D translation (t) from only one RGB image. This method can upgrade any traditional 2D-box prediction algorithm to a 3D prediction model. We evaluated our model using the LineMod dataset, and experiments have shown that our method compares favorably in terms of L2 loss and computational time.
first_indexed	2024-03-10T12:05:11Z
format	Article
id	doaj.art-e4646fa4145c4590aa784ba387e531eb
institution	Directory Open Access Journal
issn	1424-8220
language	English
last_indexed	2024-03-10T12:05:11Z
publishDate	2021-04-01
publisher	MDPI AG
record_format	Article
series	Sensors
spelling	doaj.art-e4646fa4145c4590aa784ba387e531eb | 2023-11-21T16:40:06Z | eng | MDPI AG | Sensors | 1424-8220 | 2021-04-01 | 21 | 9 | 2939 | 10.3390/s21092939 | Estimation of 6D Object Pose Using a 2D Bounding Box | Yong Hong (0); Jin Liu (1); Zahid Jahangir (2); Sheng He (3); Qing Zhang (4) | (0) State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China; (1) State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China; (2) State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China; (3) State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan 430079, China; (4) College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China | This paper provides an efficient way of addressing the problem of detecting or estimating the 6-dimensional (6D) pose of objects from an RGB image. A quaternion is used to define an object's three-dimensional pose, but the pose represented by q and the pose represented by -q are equivalent, while the L2 loss between them is very large. Therefore, we define a new quaternion pose loss function to solve this problem. Based on this, we designed a new convolutional neural network named Q-Net to estimate an object's pose. Because the quaternion output must be a unit vector, a normalization layer is added in Q-Net to keep the pose output on the unit sphere in four-dimensional space. We propose a new algorithm, called the Bounding Box Equation, to obtain 3D translation quickly and effectively from 2D bounding boxes. The algorithm uses an entirely new way of assessing the 3D rotation (R) and 3D translation (t) from only one RGB image. This method can upgrade any traditional 2D-box prediction algorithm to a 3D prediction model. We evaluated our model using the LineMod dataset, and experiments have shown that our method compares favorably in terms of L2 loss and computational time. | https://www.mdpi.com/1424-8220/21/9/2939 | 6D pose estimation; quaternion; Bounding Box Equation; LineMod
spellingShingle	Yong Hong Jin Liu Zahid Jahangir Sheng He Qing Zhang Estimation of 6D Object Pose Using a 2D Bounding Box Sensors 6D pose estimation quaternion Bounding Box Equation LineMod
title	Estimation of 6D Object Pose Using a 2D Bounding Box
title_sort	estimation of 6d object pose using a 2d bounding box
topic	6D pose estimation; quaternion; Bounding Box Equation; LineMod
url	https://www.mdpi.com/1424-8220/21/9/2939
work_keys_str_mv	AT yonghong estimationof6dobjectposeusinga2dboundingbox AT jinliu estimationof6dobjectposeusinga2dboundingbox AT zahidjahangir estimationof6dobjectposeusinga2dboundingbox AT shenghe estimationof6dobjectposeusinga2dboundingbox AT qingzhang estimationof6dobjectposeusinga2dboundingbox

Similar Items