Camera distance helps 3D hand pose estimated from a single RGB image
Most existing methods for RGB hand pose estimation use root-relative 3D coordinates for supervision. However, such supervision neglects the distance between the camera and the object (i.e., the hand). The camera distance is especially important under a perspective camera, which controls the depth-de...
Main Authors: | Yuan Cui, Moran Li, Yuan Gao, Changxin Gao, Fan Wu, Hao Wen, Jiwei Li, Nong Sang |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier, 2023-05-01 |
Series: | Graphical Models |
Subjects: | Hourglass; GCN; Hand pose estimation; Root-relative joints; Coordinates attentions |
Online Access: | http://www.sciencedirect.com/science/article/pii/S1524070323000097 |
_version_ | 1797788474443563008 |
---|---|
author | Yuan Cui Moran Li Yuan Gao Changxin Gao Fan Wu Hao Wen Jiwei Li Nong Sang |
author_facet | Yuan Cui Moran Li Yuan Gao Changxin Gao Fan Wu Hao Wen Jiwei Li Nong Sang |
author_sort | Yuan Cui |
collection | DOAJ |
description | Most existing methods for RGB hand pose estimation use root-relative 3D coordinates for supervision. However, such supervision neglects the distance between the camera and the object (i.e., the hand). The camera distance is especially important under a perspective camera, because it controls the depth-dependent scaling of the perspective projection. As a result, the same hand pose at different camera distances can be projected into different 2D shapes by the same perspective camera. Neglecting this information results in ambiguities in recovering 3D poses from 2D images. In this article, we propose a camera projection learning module (CPLM) that uses the scale factor contained in the camera distance to associate the 3D hand pose with 2D UV coordinates, which helps to further improve the accuracy of the estimated hand joints. Specifically, following previous work, we use a two-stage RGB-to-2D and 2D-to-3D method to estimate the 3D hand pose and embed a graph convolutional network in the second stage to leverage the information contained in the complex non-Euclidean structure of 2D hand joints. Experimental results demonstrate that our proposed method surpasses state-of-the-art methods on the benchmark dataset RHD and obtains competitive results on the STB and D+O datasets. |
first_indexed | 2024-03-13T01:37:05Z |
format | Article |
id | doaj.art-9945059f7c5d4d20908b91e50650d70a |
institution | Directory Open Access Journal |
issn | 1524-0703 |
language | English |
last_indexed | 2024-03-13T01:37:05Z |
publishDate | 2023-05-01 |
publisher | Elsevier |
record_format | Article |
series | Graphical Models |
spelling | doaj.art-9945059f7c5d4d20908b91e50650d70a 2023-07-04T05:09:41Z eng Elsevier Graphical Models 1524-0703 2023-05-01 127 101179 Camera distance helps 3D hand pose estimated from a single RGB image. Yuan Cui [0], Moran Li [1], Yuan Gao [2], Changxin Gao [3], Fan Wu [4], Hao Wen [5], Jiwei Li [6], Nong Sang [7]. [0] Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China; CloudWalk Technology, China. [1] Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China; CloudWalk Technology, China. [2] Electronic Information School, Wuhan University, China. [3] Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China. [4] CloudWalk Technology, China. [5] CloudWalk Technology, China. [6] CloudWalk Technology, China. [7] Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China; Corresponding author. Most existing methods for RGB hand pose estimation use root-relative 3D coordinates for supervision. However, such supervision neglects the distance between the camera and the object (i.e., the hand). The camera distance is especially important under a perspective camera, because it controls the depth-dependent scaling of the perspective projection. As a result, the same hand pose at different camera distances can be projected into different 2D shapes by the same perspective camera. Neglecting this information results in ambiguities in recovering 3D poses from 2D images. In this article, we propose a camera projection learning module (CPLM) that uses the scale factor contained in the camera distance to associate the 3D hand pose with 2D UV coordinates, which helps to further improve the accuracy of the estimated hand joints. Specifically, following previous work, we use a two-stage RGB-to-2D and 2D-to-3D method to estimate the 3D hand pose and embed a graph convolutional network in the second stage to leverage the information contained in the complex non-Euclidean structure of 2D hand joints. Experimental results demonstrate that our proposed method surpasses state-of-the-art methods on the benchmark dataset RHD and obtains competitive results on the STB and D+O datasets. http://www.sciencedirect.com/science/article/pii/S1524070323000097 Hourglass; GCN; Hand pose estimation; Root-relative joints; Coordinates attentions |
spellingShingle | Yuan Cui Moran Li Yuan Gao Changxin Gao Fan Wu Hao Wen Jiwei Li Nong Sang Camera distance helps 3D hand pose estimated from a single RGB image Graphical Models Hourglass GCN Hand pose estimation Root-relative joints Coordinates attentions |
title | Camera distance helps 3D hand pose estimated from a single RGB image |
title_full | Camera distance helps 3D hand pose estimated from a single RGB image |
title_fullStr | Camera distance helps 3D hand pose estimated from a single RGB image |
title_full_unstemmed | Camera distance helps 3D hand pose estimated from a single RGB image |
title_short | Camera distance helps 3D hand pose estimated from a single RGB image |
title_sort | camera distance helps 3d hand pose estimated from a single rgb image |
topic | Hourglass GCN Hand pose estimation Root-relative joints Coordinates attentions |
url | http://www.sciencedirect.com/science/article/pii/S1524070323000097 |
work_keys_str_mv | AT yuancui cameradistancehelps3dhandposeestimatedfromasinglergbimage AT moranli cameradistancehelps3dhandposeestimatedfromasinglergbimage AT yuangao cameradistancehelps3dhandposeestimatedfromasinglergbimage AT changxingao cameradistancehelps3dhandposeestimatedfromasinglergbimage AT fanwu cameradistancehelps3dhandposeestimatedfromasinglergbimage AT haowen cameradistancehelps3dhandposeestimatedfromasinglergbimage AT jiweili cameradistancehelps3dhandposeestimatedfromasinglergbimage AT nongsang cameradistancehelps3dhandposeestimatedfromasinglergbimage |
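
The ambiguity described in the abstract can be illustrated with a minimal sketch. This is not the article's CPLM. It is only a standard pinhole projection applied to a made-up root-relative pose, and the focal length, principal point, joint coordinates, and function names are all assumptions chosen for illustration. It shows that the same root-relative pose, placed at two camera distances, projects to 2D shapes that differ by more than a uniform rescale.

```python
import numpy as np


def project(points_3d, focal=500.0, center=(128.0, 128.0)):
    """Pinhole projection of (N, 3) camera-space points to (N, 2) pixel coordinates.

    focal and center are assumed values, not taken from the article.
    """
    x, y, z = points_3d[:, 0], points_3d[:, 1], points_3d[:, 2]
    u = focal * x / z + center[0]
    v = focal * y / z + center[1]
    return np.stack([u, v], axis=1)


# Hypothetical root-relative pose in metres: the root joint sits at the origin.
rel_pose = np.array([
    [0.00, 0.00, 0.00],
    [0.03, -0.05, 0.02],
    [0.00, -0.09, 0.04],
    [-0.03, -0.05, -0.02],
])


def uv_relative_to_root(pose, camera_distance):
    """Place the root at depth camera_distance, project, and subtract the root's UV."""
    abs_pose = pose + np.array([0.0, 0.0, camera_distance])
    uv = project(abs_pose)
    return uv - uv[0]


near = uv_relative_to_root(rel_pose, 0.3)
far = uv_relative_to_root(rel_pose, 0.9)

# Per-joint ratio of 2D offsets between the two distances (root excluded).
ratios = np.linalg.norm(near[1:], axis=1) / np.linalg.norm(far[1:], axis=1)
print(ratios)
```

The printed ratios differ from joint to joint because each joint's depth relative to the root changes its individual scale factor, so a 2D-to-3D stage that sees only the 2D shape cannot tell these configurations apart without information about the camera distance.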