Parallel temporal feature selection based on improved attention mechanism for dynamic gesture recognition
Abstract Dynamic gesture recognition has become a new mode of interaction that meets the needs of daily life. It is natural, easy to operate, and intuitive, so it has a wide range of applications. The accuracy of gesture recognition depends on the ability to accurately learn the short-t...
Main Authors: | Gongzheng Chen, Zhenghong Dong, Jue Wang, Lurui Xia |
---|---|
Format: | Article |
Language: | English |
Published: | Springer 2022-09-01 |
Series: | Complex & Intelligent Systems |
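Subjects: | Dynamic gesture recognition, Attention mechanism, Spatiotemporal features, The human–computer interaction, Video understanding |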
Online Access: | https://doi.org/10.1007/s40747-022-00858-8 |
_version_ | 1797840699172847616 |
---|---|
author | Gongzheng Chen Zhenghong Dong Jue Wang Lurui Xia |
author_sort | Gongzheng Chen |
collection | DOAJ |
description | Abstract Dynamic gesture recognition has become a new mode of interaction that meets the needs of daily life. It is natural, easy to operate, and intuitive, so it has a wide range of applications. The accuracy of gesture recognition depends on the ability to accurately learn both the short-term and long-term spatiotemporal features of gestures. Rather than improving a single type of network (convnet-based or recurrent neural network-based) or serially stacking two heterogeneous networks, we propose a fusion architecture that combines convnet-based and recurrent neural network-based models in parallel, so that short-term and long-term spatiotemporal features of gestures are learned simultaneously. At each stage of feature learning, both kinds of features are captured at once, and the contribution of the two heterogeneous networks to the classification result along the spatial and channel axes is learned automatically by an attention mechanism. The ordering and pooling operations of the channel attention module and the spatial attention module are compared experimentally. The proportion of short-term and long-term features on the channel and spatial axes at each stage of feature learning is analyzed quantitatively, and the final model is chosen according to the experimental results. The module supports end-to-end learning, and the proposed method was validated on the EgoGesture, SKIG, and IsoGD datasets, where it achieved very competitive performance. |
first_indexed | 2024-04-09T16:19:02Z |
format | Article |
id | doaj.art-22301524a2ae4c23a847042d944f6220 |
institution | Directory of Open Access Journals |
issn | 2199-4536 2198-6053 |
language | English |
last_indexed | 2024-04-09T16:19:02Z |
publishDate | 2022-09-01 |
publisher | Springer |
record_format | Article |
series | Complex & Intelligent Systems |
spelling | doaj.art-22301524a2ae4c23a847042d944f6220; 2023-04-23T11:32:18Z; eng; Springer; Complex & Intelligent Systems; 2199-4536; 2198-6053; 2022-09-01; 9; 2; 1377; 1390; 10.1007/s40747-022-00858-8; Parallel temporal feature selection based on improved attention mechanism for dynamic gesture recognition; Gongzheng Chen (0), Zhenghong Dong (1), Jue Wang (2), Lurui Xia (3); Graduate School of Aerospace Engineering University (listed for each of the four authors); https://doi.org/10.1007/s40747-022-00858-8; Dynamic gesture recognition; Attention mechanism; Spatiotemporal features; The human–computer interaction; Video understanding |
title | Parallel temporal feature selection based on improved attention mechanism for dynamic gesture recognition |
topic | Dynamic gesture recognition Attention mechanism Spatiotemporal features The human–computer interaction Video understanding |
url | https://doi.org/10.1007/s40747-022-00858-8 |
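
The abstract describes weighting the contributions of a short-term (convnet) branch and a long-term (recurrent) branch along the channel and spatial axes. The following is only a rough sketch of that general idea, not the authors' implementation: the function names, the toy feature maps, and the use of pooled activations as attention scores (in place of learned layers) are all assumptions made for illustration.

```python
import math


def softmax2(a, b):
    """Softmax over two scores; the returned weights sum to 1."""
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)


def channel_fuse(short_f, long_f):
    """Per channel, weight both branches by their global average-pooled response.

    Assumption: pooled activations stand in for scores that a trained
    attention module would normally produce.
    """
    fused, weights = [], []
    for cs, cl in zip(short_f, long_f):
        n = len(cs) * len(cs[0])
        ps = sum(map(sum, cs)) / n  # global average pool, short-term branch
        pl = sum(map(sum, cl)) / n  # global average pool, long-term branch
        ws, wl = softmax2(ps, pl)
        weights.append((ws, wl))
        fused.append([[ws * s + wl * l for s, l in zip(rs, rl)]
                      for rs, rl in zip(cs, cl)])
    return fused, weights


def spatial_fuse(short_f, long_f):
    """Per spatial location, weight both branches by their channel-averaged response."""
    C, H, W = len(short_f), len(short_f[0]), len(short_f[0][0])
    fused = [[[0.0] * W for _ in range(H)] for _ in range(C)]
    for i in range(H):
        for j in range(W):
            ps = sum(short_f[c][i][j] for c in range(C)) / C
            pl = sum(long_f[c][i][j] for c in range(C)) / C
            ws, wl = softmax2(ps, pl)
            for c in range(C):
                fused[c][i][j] = ws * short_f[c][i][j] + wl * long_f[c][i][j]
    return fused


# Hypothetical toy inputs: 2 channels, 2x2 spatial grid per branch.
short_feat = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
long_feat = [[[0.0, 0.0], [0.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]]]

channel_fused, channel_weights = channel_fuse(short_feat, long_feat)
spatial_fused = spatial_fuse(short_feat, long_feat)
```

In this toy case the short-term branch dominates channel 0 and the long-term branch dominates channel 1, so `channel_weights` gives a per-channel proportion of the two branches, loosely analogous to the quantitative proportion analysis the abstract mentions.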