Gesture recognition algorithm based on multi‐scale feature fusion in RGB‐D images

Abstract With the rapid development of sensor technology and artificial intelligence, video gesture recognition in the era of big data makes human‐computer interaction more natural and flexible, bringing a richer interactive experience to teaching, on‐board control, electronic...

Bibliographic Details
Main Authors: Ying Sun, Yaoqing Weng, Bowen Luo, Gongfa Li, Bo Tao, Du Jiang, Disi Chen
Format: Article
Language: English
Published: Wiley 2023-03-01
Series: IET Image Processing
Subjects: image processing, neural nets
Online Access: https://doi.org/10.1049/ipr2.12712
collection DOAJ
description Abstract With the rapid development of sensor technology and artificial intelligence, video gesture recognition in the era of big data makes human‐computer interaction more natural and flexible, bringing a richer interactive experience to teaching, on‐board control, electronic games, etc. To achieve robust recognition under illumination change, background clutter, rapid movement, and partial occlusion, an algorithm based on multi‐level feature fusion in a two‐stream convolutional neural network is proposed, which includes three main steps. First, a Kinect sensor acquires RGB‐D images to establish a gesture database, and data enhancement is performed on the training and test sets. Then, a two‐stream convolutional neural network model with multi‐level feature fusion is established and trained. Experimental results show that the proposed network model can robustly track and recognize gestures; compared with the single‐channel model, the average detection accuracy is improved by 1.08% and the mean average precision (mAP) by 3.56%. The average recognition rate of gestures under occlusion and different light intensities was 93.98%. Finally, on the ASL, LaRED, and 1‐miohand datasets, the recognition accuracy is satisfactory compared with other methods.
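The abstract does not give the network's layers or fusion operator, so the following is only a minimal sketch of the general idea of fusing two streams (RGB and depth) at several scales. It assumes 2x2 average pooling as a stand-in for learned convolutional stages and channel-wise concatenation as the fusion step; the function names `avg_pool2x`, `stream_features`, `fuse_levels`, and `shape` are hypothetical and not taken from the paper.

```python
def avg_pool2x(channel):
    """2x2 average pooling on one channel given as a list of equal-length rows."""
    h, w = len(channel) // 2, len(channel[0]) // 2
    return [
        [
            (channel[2 * i][2 * j] + channel[2 * i][2 * j + 1]
             + channel[2 * i + 1][2 * j] + channel[2 * i + 1][2 * j + 1]) / 4.0
            for j in range(w)
        ]
        for i in range(h)
    ]


def stream_features(image, levels=3):
    """Return one stream's feature maps at successively coarser scales.

    `image` is a list of channels. Pooling is an assumed placeholder for
    the learned convolutional stages of a real network.
    """
    feats = []
    for _ in range(levels):
        image = [avg_pool2x(ch) for ch in image]
        feats.append(image)
    return feats


def fuse_levels(rgb, depth, levels=3):
    """Fuse both streams at every level by channel-wise concatenation (assumed)."""
    rgb_feats = stream_features(rgb, levels)
    depth_feats = stream_features(depth, levels)
    return [r + d for r, d in zip(rgb_feats, depth_feats)]


def shape(fmap):
    """(channels, height, width) of a feature map stored as nested lists."""
    return (len(fmap), len(fmap[0]), len(fmap[0][0]))


# Toy 16x16 inputs: a 3-channel RGB image and a 1-channel depth map.
rgb = [[[float(c + y + x) for x in range(16)] for y in range(16)] for c in range(3)]
depth = [[[float(y * x) for x in range(16)] for y in range(16)]]

fused = fuse_levels(rgb, depth)
fused_shapes = [shape(f) for f in fused]
print(fused_shapes)
```

Each fused level carries four channels (three from RGB, one from depth) at half the resolution of the previous level, which is the kind of multi‐level representation a later classifier or detector could consume.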
first_indexed 2024-04-10T05:44:13Z
id doaj.art-608e8007b2f14b19abd5b9a12a766b88
institution Directory Open Access Journal
issn 1751-9659
1751-9667
last_indexed 2024-04-10T05:44:13Z
spelling doaj.art-608e8007b2f14b19abd5b9a12a766b88 2023-03-06T04:27:53Z eng
Wiley, IET Image Processing, ISSN 1751-9659, 1751-9667
2023-03-01, vol. 17, no. 4, pp. 1280-1290, doi 10.1049/ipr2.12712
Gesture recognition algorithm based on multi‐scale feature fusion in RGB‐D images
Ying Sun: Key Laboratory of Metallurgical Equipment and Control Technology, Ministry of Education, Wuhan University of Science and Technology, Wuhan, China
Yaoqing Weng: Research Center of Biologic Manipulator and Intelligent Measurement and Control, Wuhan University of Science and Technology, Wuhan, China
Bowen Luo: Key Laboratory of Metallurgical Equipment and Control Technology, Ministry of Education, Wuhan University of Science and Technology, Wuhan, China
Gongfa Li: Key Laboratory of Metallurgical Equipment and Control Technology, Ministry of Education, Wuhan University of Science and Technology, Wuhan, China
Bo Tao: Hubei Key Laboratory of Mechanical Transmission and Manufacturing Engineering, Wuhan University of Science and Technology, Wuhan, China
Du Jiang: Research Center of Biologic Manipulator and Intelligent Measurement and Control, Wuhan University of Science and Technology, Wuhan, China
Disi Chen: School of Computing, University of Portsmouth, Portsmouth, UK
https://doi.org/10.1049/ipr2.12712
image processing, neural nets
topic image processing
neural nets