A Key Skeleton Points Guided Classroom Action Recognition Method Based on Multimodal Symmetry Fusion
Recently, there has been growing interest in utilizing skeleton data for human action recognition due to its compact size and ability to capture action characteristics effectively. However, in complex classroom scenarios, student actions encounter challenges such as high inter-class similarity, diff...
Main Authors: | Zefang Chen, Yang Gao, Qiuyan Yan |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2024-01-01 |
Series: | IEEE Access |
Subjects: | Action recognition, multimodal, skeleton data, classroom action |
Online Access: | https://ieeexplore.ieee.org/document/10475333/ |
_version_ | 1797231177835216896 |
---|---|
author | Zefang Chen; Yang Gao; Qiuyan Yan |
author_facet | Zefang Chen; Yang Gao; Qiuyan Yan |
author_sort | Zefang Chen |
collection | DOAJ |
description | Recently, there has been growing interest in utilizing skeleton data for human action recognition due to its compact size and ability to capture action characteristics effectively. However, in complex classroom scenarios, student actions encounter challenges such as high inter-class similarity, differentiation difficulty, and redundancy, which hinder effective differentiation using existing unidirectional feature splicing multimodal methods. Therefore, we propose a key skeleton points guided classroom action recognition method based on multimodal symmetry fusion. This method is primarily characterized by several innovations. Firstly, we utilize a method called Variable Series Mean to select the most significant key skeleton points of actions. Then, these points are input into a model to learn the relevant weight values, guiding the generation of salient regions in RGB images. Finally, in the data fusion stage, we utilize the Symmetric Multi-Modal optimization function to integrate the three data streams, addressing bias issues arising from unidirectional feature splicing methods. We conducted comprehensive experiments on two datasets: NTU 60 and Classroom. Synthesizing results of multiple methods, our method achieves state-of-the-art performance on the NTU 60 dataset and the second-best performance on the private Classroom dataset. Despite not attaining the highest recognition accuracy on the Classroom dataset, this approach offers substantial benefits in terms of time and storage, providing a real-time solution for recognizing student actions in the classroom. Therefore, our method effectively captures and integrates the representation information from different modalities, enabling accurate recognition of student actions in the classroom. |
first_indexed | 2024-04-24T15:40:14Z |
format | Article |
id | doaj.art-0652a6e7a9ae42af9495a3da739ae7b3 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-24T15:40:14Z |
publishDate | 2024-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-0652a6e7a9ae42af9495a3da739ae7b3; 2024-04-01T23:00:53Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2024-01-01; vol. 12, pp. 42921-42931; DOI 10.1109/ACCESS.2024.3379449; document 10475333; A Key Skeleton Points Guided Classroom Action Recognition Method Based on Multimodal Symmetry Fusion; Zefang Chen (https://orcid.org/0009-0007-3762-3403), Yang Gao, Qiuyan Yan (https://orcid.org/0000-0002-5159-4633); School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, China (all three authors); https://ieeexplore.ieee.org/document/10475333/; Action recognition; multimodal; skeleton data; classroom action |
spellingShingle | Zefang Chen Yang Gao Qiuyan Yan A Key Skeleton Points Guided Classroom Action Recognition Method Based on Multimodal Symmetry Fusion IEEE Access Action recognition multimodal skeleton data classroom action |
title | A Key Skeleton Points Guided Classroom Action Recognition Method Based on Multimodal Symmetry Fusion |
title_full | A Key Skeleton Points Guided Classroom Action Recognition Method Based on Multimodal Symmetry Fusion |
title_fullStr | A Key Skeleton Points Guided Classroom Action Recognition Method Based on Multimodal Symmetry Fusion |
title_full_unstemmed | A Key Skeleton Points Guided Classroom Action Recognition Method Based on Multimodal Symmetry Fusion |
title_short | A Key Skeleton Points Guided Classroom Action Recognition Method Based on Multimodal Symmetry Fusion |
title_sort | key skeleton points guided classroom action recognition method based on multimodal symmetry fusion |
topic | Action recognition; multimodal; skeleton data; classroom action |
url | https://ieeexplore.ieee.org/document/10475333/ |
work_keys_str_mv | AT zefangchen akeyskeletonpointsguidedclassroomactionrecognitionmethodbasedonmultimodalsymmetryfusion AT yanggao akeyskeletonpointsguidedclassroomactionrecognitionmethodbasedonmultimodalsymmetryfusion AT qiuyanyan akeyskeletonpointsguidedclassroomactionrecognitionmethodbasedonmultimodalsymmetryfusion AT zefangchen keyskeletonpointsguidedclassroomactionrecognitionmethodbasedonmultimodalsymmetryfusion AT yanggao keyskeletonpointsguidedclassroomactionrecognitionmethodbasedonmultimodalsymmetryfusion AT qiuyanyan keyskeletonpointsguidedclassroomactionrecognitionmethodbasedonmultimodalsymmetryfusion |