Parallel temporal feature selection based on improved attention mechanism for dynamic gesture recognition

Abstract: Dynamic gesture recognition has become a new mode of interaction that meets the needs of daily use. It is natural, easy to operate, and intuitive, so it has a wide range of applications. The accuracy of gesture recognition depends on the ability to accurately learn the short-t...


Bibliographic Details
Main Authors:	Gongzheng Chen, Zhenghong Dong, Jue Wang, Lurui Xia
Format:	Article
Language:	English
Published:	Springer 2022-09-01
Series:	Complex & Intelligent Systems
Subjects:	Dynamic gesture recognition; Attention mechanism; Spatiotemporal features; The human–computer interaction; Video understanding
Online Access:	https://doi.org/10.1007/s40747-022-00858-8

collection	DOAJ
description	Abstract: Dynamic gesture recognition has become a new mode of interaction that meets the needs of daily use. It is natural, easy to operate, and intuitive, so it has a wide range of applications. The accuracy of gesture recognition depends on the ability to accurately learn the short-term and long-term spatiotemporal features of gestures. Unlike work that improves the performance of a single type of network (convnet-based or recurrent neural network-based models) or that serially stacks two heterogeneous networks, we propose a fusion architecture that combines convnet-based and recurrent neural network-based models in parallel and can learn short-term and long-term spatiotemporal features of gestures simultaneously. At each stage of feature learning, the short-term and long-term spatiotemporal features of gestures are captured simultaneously, and the contribution of the two heterogeneous networks to the classification result along the spatial and channel axes is learned automatically by an attention mechanism. The ordering and pooling operations of the channel attention module and the spatial attention module are compared experimentally. The proportion of short-term and long-term features on the channel and spatial axes at each stage of feature learning is analyzed quantitatively, and the final model is determined from the experimental results. The module can be used for end-to-end learning, and the proposed method was validated on the EgoGesture, SKIG, and IsoGD datasets, where it achieved very competitive performance.
first_indexed	2024-04-09T16:19:02Z
id	doaj.art-22301524a2ae4c23a847042d944f6220
institution	Directory of Open Access Journals
issn	2199-4536 2198-6053
last_indexed	2024-04-09T16:19:02Z
spelling	doaj.art-22301524a2ae4c23a847042d944f6220; 2023-04-23T11:32:18Z; eng; Springer; Complex & Intelligent Systems; ISSN 2199-4536, 2198-6053; 2022-09-01; vol. 9, no. 2, pp. 1377-1390; doi:10.1007/s40747-022-00858-8; Gongzheng Chen, Zhenghong Dong, Jue Wang, Lurui Xia (all: Graduate School of Aerospace Engineering University)


Similar Items