Camera distance helps 3D hand pose estimated from a single RGB image

Most existing methods for RGB hand pose estimation use root-relative 3D coordinates for supervision. However, such supervision neglects the distance between the camera and the object (i.e., the hand). The camera distance is especially important under a perspective camera, which controls the depth-de...
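
The effect of camera distance described in the abstract can be illustrated with a plain pinhole projection. The following is a minimal sketch, not the authors' CPLM: the three-joint "hand", the focal length, the camera distances, and the helper names `project` and `bone_ratio` are all hypothetical values chosen for illustration. It shows that the same root-relative pose, placed at two distances from the same camera, produces 2D shapes that differ by more than a uniform scale.

```python
import math

def project(points, focal, distance):
    """Pinhole projection of root-relative 3D points.

    `distance` is the assumed camera-to-root distance along the optical axis,
    so each point's absolute depth is its relative z plus `distance`.
    """
    projected = []
    for x, y, z in points:
        depth = z + distance
        projected.append((focal * x / depth, focal * y / depth))
    return projected

def bone_ratio(uv):
    """Ratio of the 2D lengths root->B and root->A for a three-joint toy hand."""
    root, a, b = uv
    len_a = math.hypot(a[0] - root[0], a[1] - root[1])
    len_b = math.hypot(b[0] - root[0], b[1] - root[1])
    return len_b / len_a

# Hypothetical root-relative pose in metres: root, joint A in the image plane,
# joint B tilted 6 cm away from the camera.
HAND = [(0.0, 0.0, 0.0), (0.08, 0.0, 0.0), (0.0, 0.08, 0.06)]
FOCAL = 500.0  # assumed focal length in pixels

ratio_near = bone_ratio(project(HAND, FOCAL, 0.3))  # hand 0.3 m from camera
ratio_far = bone_ratio(project(HAND, FOCAL, 1.0))   # hand 1.0 m from camera

print(f"near: {ratio_near:.3f}, far: {ratio_far:.3f}")
```

Because the tilted bone is foreshortened more strongly when the hand is close to the camera, the two ratios differ, which is the kind of ambiguity that root-relative supervision alone cannot resolve.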

Bibliographic Details
Main Authors:	Yuan Cui, Moran Li, Yuan Gao, Changxin Gao, Fan Wu, Hao Wen, Jiwei Li, Nong Sang
Format:	Article
Language:	English
Published:	Elsevier 2023-05-01
Series:	Graphical Models
Subjects:	Hourglass; GCN; Hand pose estimation; Root-relative joints; Coordinates attentions
Online Access:	http://www.sciencedirect.com/science/article/pii/S1524070323000097

author	Yuan Cui, Moran Li, Yuan Gao, Changxin Gao, Fan Wu, Hao Wen, Jiwei Li, Nong Sang
collection	DOAJ
description	Most existing methods for RGB hand pose estimation use root-relative 3D coordinates for supervision. However, such supervision neglects the distance between the camera and the object (i.e., the hand). The camera distance is especially important under a perspective camera because it controls the depth-dependent scaling of the perspective projection. As a result, the same hand pose at different camera distances can be projected into different 2D shapes by the same perspective camera. Neglecting this information results in ambiguities when recovering 3D poses from 2D images. In this article, we propose a camera projection learning module (CPLM) that uses the scale factor contained in the camera distance to associate the 3D hand pose with 2D UV coordinates, which helps to further improve the accuracy of the estimated hand joints. Specifically, following previous work, we use a two-stage RGB-to-2D and 2D-to-3D method to estimate the 3D hand pose and embed a graph convolutional network in the second stage to leverage the information contained in the complex non-Euclidean structure of 2D hand joints. Experimental results demonstrate that our proposed method surpasses state-of-the-art methods on the benchmark dataset RHD and obtains competitive results on the STB and D+O datasets.
first_indexed	2024-03-13T01:37:05Z
format	Article
id	doaj.art-9945059f7c5d4d20908b91e50650d70a
institution	Directory of Open Access Journals
issn	1524-0703
language	English
last_indexed	2024-03-13T01:37:05Z
publishDate	2023-05-01
publisher	Elsevier
record_format	Article
series	Graphical Models
spelling	doaj.art-9945059f7c5d4d20908b91e50650d70a | 2023-07-04T05:09:41Z | eng | Elsevier | Graphical Models | 1524-0703 | 2023-05-01 | 127 | 101179 | Camera distance helps 3D hand pose estimated from a single RGB image | Authors: Yuan Cui (0), Moran Li (1), Yuan Gao (2), Changxin Gao (3), Fan Wu (4), Hao Wen (5), Jiwei Li (6), Nong Sang (7) | Affiliations: (0) Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China, and CloudWalk Technology, China; (1) Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China, and CloudWalk Technology, China; (2) Electronic Information School, Wuhan University, China; (3) Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China; (4) CloudWalk Technology, China; (5) CloudWalk Technology, China; (6) CloudWalk Technology, China; (7) Key Laboratory of Image Processing and Intelligent Control, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan 430074, China (corresponding author) | http://www.sciencedirect.com/science/article/pii/S1524070323000097 | Hourglass; GCN; Hand pose estimation; Root-relative joints; Coordinates attentions
title	Camera distance helps 3D hand pose estimated from a single RGB image
topic	Hourglass; GCN; Hand pose estimation; Root-relative joints; Coordinates attentions
url	http://www.sciencedirect.com/science/article/pii/S1524070323000097

Similar Items