Robust 3D Hand Detection from a Single RGB-D Image in Unconstrained Environments

Bibliographic Details
Main Authors: Chi Xu, Jun Zhou, Wendi Cai, Yunkai Jiang, Yongbo Li, Yi Liu
Format: Article
Language: English
Published: MDPI AG 2020-11-01
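Series: Sensors
Subjects: 3D hand detection, RGB-D sensor, human–computer interaction, unseen lighting condition, adaptive RGB-D fusion
Online Access: https://www.mdpi.com/1424-8220/20/21/6360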
author Chi Xu
Jun Zhou
Wendi Cai
Yunkai Jiang
Yongbo Li
Yi Liu
collection DOAJ
description Three-dimensional hand detection from a single RGB-D image is an important technology that supports many useful applications. In practice, robustly detecting human hands in unconstrained environments is challenging because the RGB-D channels can be affected by many uncontrollable factors, such as lighting changes. To tackle this problem, we propose a 3D hand detection approach that improves robustness and accuracy by adaptively fusing the complementary features extracted from the RGB and D channels. Using the fused RGB-D feature, the 2D bounding boxes of hands are detected first, and the 3D locations along the z-axis are then estimated through a cascaded network. Furthermore, we present a challenging RGB-D hand detection dataset collected in unconstrained environments. Unlike previous works, which primarily rely on either the RGB or the D channel, we adaptively fuse the RGB-D channels for hand detection. Evaluation results show that the D channel is crucial for hand detection in unconstrained environments. Our RGB-D fusion-based approach improves hand detection accuracy from 69.1 to 74.1 compared with a state-of-the-art RGB-based hand detector. Existing RGB- or D-based methods are unstable in unseen lighting conditions: in dark conditions, the accuracy of the RGB-based method drops to 48.9, and in back-light conditions, the accuracy of the D-based method drops to 28.3. Compared with these methods, our RGB-D fusion-based approach is much more robust and does not show such degradation: its detection accuracy is 62.5 and 65.9, respectively, in these two extreme lighting conditions.
first_indexed 2024-03-10T15:01:19Z
format Article
id doaj.art-aa6fe93573f344ababa13ac43725068c
institution Directory Open Access Journal
issn 1424-8220
language English
last_indexed 2024-03-10T15:01:19Z
publishDate 2020-11-01
publisher MDPI AG
record_format Article
series Sensors
spelling doaj.art-aa6fe93573f344ababa13ac43725068c
2023-11-20T20:08:57Z
eng
MDPI AG
Sensors
1424-8220
2020-11-01
Volume 20, Issue 21, Article 6360
10.3390/s20216360
Robust 3D Hand Detection from a Single RGB-D Image in Unconstrained Environments
Chi Xu (School of Automation, China University of Geosciences, Wuhan 430074, China)
Jun Zhou (School of Automation, China University of Geosciences, Wuhan 430074, China)
Wendi Cai (School of Automation, China University of Geosciences, Wuhan 430074, China)
Yunkai Jiang (School of Automation, China University of Geosciences, Wuhan 430074, China)
Yongbo Li (School of Automation, China University of Geosciences, Wuhan 430074, China)
Yi Liu (CRRC Zhuzhou Electric Locomotive Co., Ltd., Zhuzhou 412000, China)
https://www.mdpi.com/1424-8220/20/21/6360
title Robust 3D Hand Detection from a Single RGB-D Image in Unconstrained Environments
topic 3D hand detection
RGB-D sensor
human–computer interaction
unseen lighting condition
adaptive RGB-D fusion
url https://www.mdpi.com/1424-8220/20/21/6360