Relative Pose Estimation between Image Object and ShapeNet CAD Model for Automatic 4-DoF Annotation
Estimating the three-dimensional (3D) pose of real objects from only a single RGB image is an interesting and difficult topic. This study proposes a new pipeline that estimates and represents the pose of an object in an RGB image with only a 4-DoF annotation to a matching CAD model. The proposed method retrieves CAD candidates from the ShapeNet dataset and uses pose-constrained 2D renderings of the candidates to find the best-matching CAD model. The pose estimation pipeline consists of several learned-network steps followed by image similarity measurements. First, from a single RGB image, the category and the object region are determined and segmented. Second, the 3-DoF rotational pose of the object is estimated by a learned pose-contrast network using only the segmented object region. Then, 2D rendering images of the CAD candidates are generated based on the rotational pose result. Finally, an image similarity measurement is performed to find the best-matching CAD model and to determine the 1-DoF focal length of the camera that aligns the model with the object. Conventional pose estimation methods employ 9-DoF pose parameters because the scales of both the image object and the CAD model are unknown. However, this study shows that only 4-DoF annotation parameters between the real object and the CAD model are enough to facilitate projection of the CAD model into the RGB space for image-graphic applications such as Extended Reality. In the experiments, the performance of the proposed method is analyzed using ground truth and compared with a triplet-loss learning method.
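The final pipeline step ranks pose-constrained 2D renderings of the CAD candidates by image similarity. As a minimal sketch of such a ranking step, the snippet below uses zero-mean normalized cross-correlation as the similarity measure; this measure and all function names are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def ncc_similarity(a, b):
    """Zero-mean normalized cross-correlation between two equal-size
    grayscale images; returns a score in [-1, 1]."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def best_matching_render(segmented_object, candidate_renders):
    """Score each candidate rendering against the segmented object
    region and return the index of the best match plus all scores."""
    scores = [ncc_similarity(segmented_object, r) for r in candidate_renders]
    return int(np.argmax(scores)), scores
```

In practice the paper's similarity measurement would operate on renderings already rotated to the estimated 3-DoF pose, so the ranking only has to discriminate between candidate shapes, not poses.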
Main Authors: | Soon-Yong Park, Chang-Min Son, Won-Jae Jeong, Sieun Park |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-01-01 |
Series: | Applied Sciences |
Subjects: | pose estimation; CAD retrieval; ShapeNet; image similarity; 4-DoF annotation; extended reality |
Online Access: | https://www.mdpi.com/2076-3417/13/2/693 |
author | Soon-Yong Park; Chang-Min Son; Won-Jae Jeong; Sieun Park
author_sort | Soon-Yong Park |
collection | DOAJ |
description | Estimating the three-dimensional (3D) pose of real objects from only a single RGB image is an interesting and difficult topic. This study proposes a new pipeline that estimates and represents the pose of an object in an RGB image with only a 4-DoF annotation to a matching CAD model. The proposed method retrieves CAD candidates from the ShapeNet dataset and uses pose-constrained 2D renderings of the candidates to find the best-matching CAD model. The pose estimation pipeline consists of several learned-network steps followed by image similarity measurements. First, from a single RGB image, the category and the object region are determined and segmented. Second, the 3-DoF rotational pose of the object is estimated by a learned pose-contrast network using only the segmented object region. Then, 2D rendering images of the CAD candidates are generated based on the rotational pose result. Finally, an image similarity measurement is performed to find the best-matching CAD model and to determine the 1-DoF focal length of the camera that aligns the model with the object. Conventional pose estimation methods employ 9-DoF pose parameters because the scales of both the image object and the CAD model are unknown. However, this study shows that only 4-DoF annotation parameters between the real object and the CAD model are enough to facilitate projection of the CAD model into the RGB space for image-graphic applications such as Extended Reality. In the experiments, the performance of the proposed method is analyzed using ground truth and compared with a triplet-loss learning method.
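The 4-DoF parameterization described in the abstract combines a 3-DoF rotation with a 1-DoF focal length. A minimal sketch of how such an annotation could project CAD-model points into the image is given below; placing the model at a fixed canonical depth on the optical axis is an assumption of this sketch, and all names are hypothetical rather than taken from the paper:

```python
import numpy as np

def rotation_matrix(yaw, pitch, roll):
    """Compose a 3-DoF rotation from Euler angles (radians), Z-Y-X order."""
    cz, sz = np.cos(yaw), np.sin(yaw)
    cy, sy = np.cos(pitch), np.sin(pitch)
    cx, sx = np.cos(roll), np.sin(roll)
    Rz = np.array([[cz, -sz, 0.0], [sz, cz, 0.0], [0.0, 0.0, 1.0]])
    Ry = np.array([[cy, 0.0, sy], [0.0, 1.0, 0.0], [-sy, 0.0, cy]])
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, cx, -sx], [0.0, sx, cx]])
    return Rz @ Ry @ Rx

def project_cad_points(points, yaw, pitch, roll, focal, depth=2.0):
    """Project (N, 3) CAD-model points with a 4-DoF annotation:
    three rotation angles plus one focal length. The model sits at a
    fixed canonical depth, so no translation or scale is estimated."""
    R = rotation_matrix(yaw, pitch, roll)
    cam = points @ R.T + np.array([0.0, 0.0, depth])
    return focal * cam[:, :2] / cam[:, 2:3]  # pinhole projection to pixels
```

Fixing the depth is what lets the focal length alone control the apparent size of the projected model, which is the intuition behind dropping the remaining 5 DoF of a full 9-DoF annotation.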
first_indexed | 2024-03-09T13:46:28Z |
format | Article |
id | doaj.art-b3eee87b51e247a783b4b2214ec42851 |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-09T13:46:28Z |
publishDate | 2023-01-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
doi | 10.3390/app13020693
affiliation | Soon-Yong Park: School of Electronics Engineering, Kyungpook National University, Daegu 41566, Republic of Korea
affiliation | Chang-Min Son: Graduate School of Electronic and Electrical Engineering, Kyungpook National University, Daegu 41566, Republic of Korea
affiliation | Won-Jae Jeong: Graduate School of Electronic and Electrical Engineering, Kyungpook National University, Daegu 41566, Republic of Korea
affiliation | Sieun Park: Graduate School of Electronic and Electrical Engineering, Kyungpook National University, Daegu 41566, Republic of Korea
title | Relative Pose Estimation between Image Object and ShapeNet CAD Model for Automatic 4-DoF Annotation |
topic | pose estimation CAD retrieval ShapeNet image similarity 4-DoF annotation extended reality |
url | https://www.mdpi.com/2076-3417/13/2/693 |