Parallel temporal feature selection based on improved attention mechanism for dynamic gesture recognition

Abstract Dynamic gesture recognition has become a new form of interaction for everyday use. Because it is natural, intuitive, and easy to operate, it has a wide range of applications. The accuracy of gesture recognition depends on the ability to accurately learn the short-t...

Bibliographic Details
Main Authors: Gongzheng Chen, Zhenghong Dong, Jue Wang, Lurui Xia
Format: Article
Language: English
Published: Springer 2022-09-01
Series: Complex & Intelligent Systems
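Subjects: Dynamic gesture recognition; Attention mechanism; Spatiotemporal features; The human–computer interaction; Video understanding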
Online Access: https://doi.org/10.1007/s40747-022-00858-8
author Gongzheng Chen
Zhenghong Dong
Jue Wang
Lurui Xia
collection DOAJ
description Abstract Dynamic gesture recognition has become a new form of interaction for everyday use. Because it is natural, intuitive, and easy to operate, it has a wide range of applications. The accuracy of gesture recognition depends on how well the short-term and long-term spatiotemporal features of gestures are learned. Rather than improving a single type of network (convnet-based or recurrent neural network-based) or serially stacking two heterogeneous networks, we propose a fusion architecture that combines convnet-based and recurrent neural network-based models in parallel, so that short-term and long-term spatiotemporal features of gestures are learned simultaneously. At each stage of feature learning, both kinds of features are captured at the same time, and the contribution of the two heterogeneous networks to the classification result along the spatial and channel axes is learned automatically by an attention mechanism. The ordering and the pooling operations of the channel attention module and the spatial attention module are compared experimentally. The proportion of short-term and long-term features on the channel and spatial axes at each stage of feature learning is analyzed quantitatively, and the final model is chosen according to the experimental results. The module supports end-to-end learning, and the proposed method was validated on the EgoGesture, SKIG, and IsoGD datasets, where it achieved competitive performance.
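The abstract describes weighting two parallel branches along the channel and spatial axes. The sketch below illustrates only that general idea and is not taken from the paper: the function names (`softmax2`, `channel_fuse`, `spatial_fuse`), the use of average pooling, and the softmax gate over pooled descriptors are all assumptions made for illustration. A trained module would presumably learn its weights rather than compute them directly from pooled activations.

```python
import math

def softmax2(a, b):
    """Softmax over two scalars; the returned weights sum to 1."""
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)

def channel_fuse(short, long_):
    """Fuse two (C, H, W) nested-list maps with one weight pair per channel."""
    fused, weights = [], []
    for cs, cl in zip(short, long_):
        # Global average pooling: one descriptor per channel and branch.
        ds = sum(map(sum, cs)) / (len(cs) * len(cs[0]))
        dl = sum(map(sum, cl)) / (len(cl) * len(cl[0]))
        ws, wl = softmax2(ds, dl)
        weights.append((ws, wl))
        fused.append([[ws * s + wl * l for s, l in zip(rs, rl)]
                      for rs, rl in zip(cs, cl)])
    return fused, weights

def spatial_fuse(short, long_):
    """Fuse two (C, H, W) maps with one weight pair per spatial position."""
    C, H, W = len(short), len(short[0]), len(short[0][0])
    fused = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    weights = [[None] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            # Average pooling over the channel axis at position (i, j).
            ds = sum(short[c][i][j] for c in range(C)) / C
            dl = sum(long_[c][i][j] for c in range(C)) / C
            ws, wl = softmax2(ds, dl)
            weights[i][j] = (ws, wl)
            for c in range(C):
                fused[c][i][j] = ws * short[c][i][j] + wl * long_[c][i][j]
    return fused, weights

# Toy inputs (hypothetical): 2 channels, 2x2 positions per branch.
short = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]  # short-term branch
long_ = [[[1.0, 2.0], [3.0, 4.0]], [[2.0, 2.0], [2.0, 2.0]]]  # long-term branch
fused_c, w_c = channel_fuse(short, long_)
fused_s, w_s = spatial_fuse(short, long_)
```

In the toy input, channel 0 is identical in both branches, so the two branches receive equal weight there; channel 1 is stronger in the long-term branch, so that branch receives the larger weight. Inspecting `w_c` and `w_s` is a simple stand-in for the quantitative analysis of branch proportions that the abstract mentions.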
first_indexed 2024-04-09T16:19:02Z
format Article
id doaj.art-22301524a2ae4c23a847042d944f6220
institution Directory Open Access Journal
issn 2199-4536
2198-6053
language English
last_indexed 2024-04-09T16:19:02Z
publishDate 2022-09-01
publisher Springer
record_format Article
series Complex & Intelligent Systems
spelling doaj.art-22301524a2ae4c23a847042d944f6220
2023-04-23T11:32:18Z
eng
Springer
Complex & Intelligent Systems
2199-4536
2198-6053
2022-09-01
Vol. 9, No. 2, pp. 1377-1390
10.1007/s40747-022-00858-8
Parallel temporal feature selection based on improved attention mechanism for dynamic gesture recognition
Gongzheng Chen (Graduate School of Aerospace Engineering University)
Zhenghong Dong (Graduate School of Aerospace Engineering University)
Jue Wang (Graduate School of Aerospace Engineering University)
Lurui Xia (Graduate School of Aerospace Engineering University)
https://doi.org/10.1007/s40747-022-00858-8
Dynamic gesture recognition
Attention mechanism
Spatiotemporal features
The human–computer interaction
Video understanding
title Parallel temporal feature selection based on improved attention mechanism for dynamic gesture recognition
topic Dynamic gesture recognition
Attention mechanism
Spatiotemporal features
The human–computer interaction
Video understanding
url https://doi.org/10.1007/s40747-022-00858-8