Camera distance helps 3D hand pose estimated from a single RGB image

Most existing methods for RGB hand pose estimation use root-relative 3D coordinates for supervision. However, such supervision neglects the distance between the camera and the object (i.e., the hand). The camera distance is especially important under a perspective camera, which controls the depth-de...

Bibliographic Details
Main Authors: Yuan Cui, Moran Li, Yuan Gao, Changxin Gao, Fan Wu, Hao Wen, Jiwei Li, Nong Sang
Format: Article
Language: English
Published: Elsevier 2023-05-01
Series: Graphical Models
Subjects: Hourglass; GCN; Hand pose estimation; Root-relative joints; Coordinates attentions
Online Access: http://www.sciencedirect.com/science/article/pii/S1524070323000097
author Yuan Cui
Moran Li
Yuan Gao
Changxin Gao
Fan Wu
Hao Wen
Jiwei Li
Nong Sang
collection DOAJ
description Most existing methods for RGB hand pose estimation use root-relative 3D coordinates for supervision. However, such supervision neglects the distance between the camera and the object (i.e., the hand). Camera distance is especially important under a perspective camera because it controls the depth-dependent scaling of the perspective projection. As a result, the same hand pose at different camera distances can be projected into different 2D shapes by the same perspective camera. Neglecting this information results in ambiguities when recovering 3D poses from 2D images. In this article, we propose a camera projection learning module (CPLM) that uses the scale factor contained in the camera distance to associate the 3D hand pose with 2D UV coordinates, which helps to further improve the accuracy of the estimated hand joints. Specifically, following previous work, we use a two-stage RGB-to-2D and 2D-to-3D method to estimate the 3D hand pose, and we embed a graph convolutional network in the second stage to leverage the information contained in the complex non-Euclidean structure of 2D hand joints. Experimental results demonstrate that our proposed method surpasses state-of-the-art methods on the benchmark dataset RHD and obtains competitive results on the STB and D+O datasets.
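The claim in the description that one hand pose can project to different 2D shapes at different camera distances can be illustrated with a small pinhole-camera sketch. This is not the paper's CPLM; the focal length, principal point, joint coordinates, and function names below are all assumed values chosen for illustration.

```python
import math

# Illustrative pinhole projection only; not the paper's CPLM.
# Focal length, principal point, and joint positions are assumed values.

def project(joints_rel, root_depth, focal=500.0, center=(128.0, 128.0)):
    """Project root-relative 3D joints (metres) whose root sits at root_depth."""
    uv = []
    for x, y, z in joints_rel:
        depth = root_depth + z  # absolute depth of this joint
        uv.append((focal * x / depth + center[0],
                   focal * y / depth + center[1]))
    return uv

def bone_ratio(uv):
    """2D length root->joint 2 divided by root->joint 1.

    The ratio is unchanged by uniform image scaling, so a change in it
    means the projected shape itself has changed.
    """
    root, a, b = uv
    return math.dist(root, b) / math.dist(root, a)

# Same root-relative pose: joint 1 at the root's depth, joint 2 is 5 cm deeper.
POSE = [(0.0, 0.0, 0.0), (0.05, 0.0, 0.0), (0.0, 0.05, 0.05)]

near = bone_ratio(project(POSE, root_depth=0.3))
far = bone_ratio(project(POSE, root_depth=1.0))
print(f"near: {near:.3f}, far: {far:.3f}")  # near: 0.857, far: 0.952
```

Both 3D bones are 5 cm wide in the image plane directions, yet their 2D length ratio depends on the root depth, which is the ambiguity that root-relative supervision alone does not resolve.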
first_indexed 2024-03-13T01:37:05Z
format Article
id doaj.art-9945059f7c5d4d20908b91e50650d70a
institution Directory Open Access Journal
issn 1524-0703
language English
last_indexed 2024-03-13T01:37:05Z
publishDate 2023-05-01
publisher Elsevier
record_format Article
series Graphical Models
spelling doaj.art-9945059f7c5d4d20908b91e50650d70a 2023-07-04T05:09:41Z eng Elsevier Graphical Models 1524-0703 2023-05-01 127 101179
Camera distance helps 3D hand pose estimated from a single RGB image
Yuan Cui (0), Moran Li (1), Yuan Gao (2), Changxin Gao (3), Fan Wu (4), Hao Wen (5), Jiwei Li (6), Nong Sang (7)
(0) Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China; CloudWalk Technology, China
(1) Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China; CloudWalk Technology, China
(2) Electronic Information School, Wuhan University, China
(3) Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China
(4) CloudWalk Technology, China
(5) CloudWalk Technology, China
(6) CloudWalk Technology, China
(7) Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China; Corresponding author.
http://www.sciencedirect.com/science/article/pii/S1524070323000097
Hourglass; GCN; Hand pose estimation; Root-relative joints; Coordinates attentions
title Camera distance helps 3D hand pose estimated from a single RGB image
topic Hourglass
GCN
Hand pose estimation
Root-relative joints
Coordinates attentions
url http://www.sciencedirect.com/science/article/pii/S1524070323000097