Relative Pose Estimation between Image Object and ShapeNet CAD Model for Automatic 4-DoF Annotation

Estimating the three-dimensional (3D) pose of real objects from a single RGB image is an interesting and difficult topic. This study proposes a new pipeline that estimates and represents the pose of an object in an RGB image using only a 4-DoF annotation relative to a matching CAD model. The proposed method retrieves CAD candidates from the ShapeNet dataset and uses pose-constrained 2D renderings of the candidates to find the best-matching CAD model. The pose estimation pipeline consists of several learned networks followed by image similarity measurements. First, from a single RGB image, the object's category and region are determined and segmented. Second, the 3-DoF rotational pose of the object is estimated by a learned pose-contrast network using only the segmented object region. Next, 2D rendering images of the CAD candidates are generated based on the estimated rotational pose. Finally, an image similarity measurement is performed to find the best-matching CAD model and to determine the 1-DoF focal length of the camera that aligns the model with the object. Conventional pose estimation methods employ 9-DoF pose parameters because the scales of both the image object and the CAD model are unknown. However, this study shows that only 4-DoF annotation parameters between the real object and the CAD model are enough to facilitate the projection of the CAD model into RGB space for image-graphics applications such as Extended Reality. In the experiments, the performance of the proposed method is analyzed using ground truth and compared with a triplet-loss learning method.


Bibliographic Details
Main Authors: Soon-Yong Park, Chang-Min Son, Won-Jae Jeong, Sieun Park
Format: Article
Language: English
Published: MDPI AG, 2023-01-01
Series: Applied Sciences
Subjects: pose estimation; CAD retrieval; ShapeNet; image similarity; 4-DoF annotation; extended reality
Online Access: https://www.mdpi.com/2076-3417/13/2/693
_version_ 1797446780531507200
author Soon-Yong Park
Chang-Min Son
Won-Jae Jeong
Sieun Park
author_facet Soon-Yong Park
Chang-Min Son
Won-Jae Jeong
Sieun Park
author_sort Soon-Yong Park
collection DOAJ
description Estimating the three-dimensional (3D) pose of real objects from a single RGB image is an interesting and difficult topic. This study proposes a new pipeline that estimates and represents the pose of an object in an RGB image using only a 4-DoF annotation relative to a matching CAD model. The proposed method retrieves CAD candidates from the ShapeNet dataset and uses pose-constrained 2D renderings of the candidates to find the best-matching CAD model. The pose estimation pipeline consists of several learned networks followed by image similarity measurements. First, from a single RGB image, the object's category and region are determined and segmented. Second, the 3-DoF rotational pose of the object is estimated by a learned pose-contrast network using only the segmented object region. Next, 2D rendering images of the CAD candidates are generated based on the estimated rotational pose. Finally, an image similarity measurement is performed to find the best-matching CAD model and to determine the 1-DoF focal length of the camera that aligns the model with the object. Conventional pose estimation methods employ 9-DoF pose parameters because the scales of both the image object and the CAD model are unknown. However, this study shows that only 4-DoF annotation parameters between the real object and the CAD model are enough to facilitate the projection of the CAD model into RGB space for image-graphics applications such as Extended Reality. In the experiments, the performance of the proposed method is analyzed using ground truth and compared with a triplet-loss learning method.
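The 4-DoF annotation described in the abstract (3-DoF rotation plus a 1-DoF focal length) can be sketched as a simple pinhole projection of CAD-model points into the image plane. The Euler-angle convention, the fixed object depth, and the function names below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def euler_zyx_to_rot(yaw, pitch, roll):
    """Z-Y-X Euler angles to a rotation matrix (one common convention;
    the paper's exact rotation parameterization is an assumption here)."""
    cz, sz = np.cos(yaw), np.sin(yaw)
    cy, sy = np.cos(pitch), np.sin(pitch)
    cx, sx = np.cos(roll), np.sin(roll)
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    return Rz @ Ry @ Rx

def project_4dof(points, yaw, pitch, roll, f, depth=3.0):
    """Project Nx3 CAD-model points with only 4 parameters:
    3-DoF rotation + 1-DoF focal length f (pixels). The object is
    placed at a fixed depth on the optical axis, an illustrative
    stand-in for the unknown-scale translation the paper avoids."""
    R = euler_zyx_to_rot(yaw, pitch, roll)
    cam = points @ R.T + np.array([0.0, 0.0, depth])  # rotate, then push in front of camera
    return f * cam[:, :2] / cam[:, 2:3]               # pinhole projection to image plane
```

For example, with identity rotation and f = 300, the model point (1, 0, 0) at depth 3 projects to (100, 0); only the rotation angles and f need to be annotated, which is the 4-DoF representation the study argues is sufficient for Extended Reality overlays.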
first_indexed 2024-03-09T13:46:28Z
format Article
id doaj.art-b3eee87b51e247a783b4b2214ec42851
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-09T13:46:28Z
publishDate 2023-01-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-b3eee87b51e247a783b4b2214ec42851 2023-11-30T21:00:07Z
eng | MDPI AG | Applied Sciences | ISSN 2076-3417 | 2023-01-01 | vol. 13, no. 2, art. 693 | DOI: 10.3390/app13020693
Relative Pose Estimation between Image Object and ShapeNet CAD Model for Automatic 4-DoF Annotation
Soon-Yong Park — School of Electronics Engineering, Kyungpook National University, Daegu 41566, Republic of Korea
Chang-Min Son — Graduate School of Electronic and Electrical Engineering, Kyungpook National University, Daegu 41566, Republic of Korea
Won-Jae Jeong — Graduate School of Electronic and Electrical Engineering, Kyungpook National University, Daegu 41566, Republic of Korea
Sieun Park — Graduate School of Electronic and Electrical Engineering, Kyungpook National University, Daegu 41566, Republic of Korea
spellingShingle Soon-Yong Park
Chang-Min Son
Won-Jae Jeong
Sieun Park
Relative Pose Estimation between Image Object and ShapeNet CAD Model for Automatic 4-DoF Annotation
Applied Sciences
pose estimation
CAD retrieval
ShapeNet
image similarity
4-DoF annotation
extended reality
title Relative Pose Estimation between Image Object and ShapeNet CAD Model for Automatic 4-DoF Annotation
title_full Relative Pose Estimation between Image Object and ShapeNet CAD Model for Automatic 4-DoF Annotation
title_fullStr Relative Pose Estimation between Image Object and ShapeNet CAD Model for Automatic 4-DoF Annotation
title_full_unstemmed Relative Pose Estimation between Image Object and ShapeNet CAD Model for Automatic 4-DoF Annotation
title_short Relative Pose Estimation between Image Object and ShapeNet CAD Model for Automatic 4-DoF Annotation
title_sort relative pose estimation between image object and shapenet cad model for automatic 4 dof annotation
topic pose estimation
CAD retrieval
ShapeNet
image similarity
4-DoF annotation
extended reality
url https://www.mdpi.com/2076-3417/13/2/693
work_keys_str_mv AT soonyongpark relativeposeestimationbetweenimageobjectandshapenetcadmodelforautomatic4dofannotation
AT changminson relativeposeestimationbetweenimageobjectandshapenetcadmodelforautomatic4dofannotation
AT wonjaejeong relativeposeestimationbetweenimageobjectandshapenetcadmodelforautomatic4dofannotation
AT sieunpark relativeposeestimationbetweenimageobjectandshapenetcadmodelforautomatic4dofannotation