Spatial–Temporal Self-Attention Enhanced Graph Convolutional Networks for Fitness Yoga Action Recognition
Fitness yoga is now a popular form of national fitness and sports-based physical therapy. At present, Microsoft Kinect, a depth sensor, and other applications are widely used to monitor and guide yoga performance, but they are inconvenient to use and relatively expensive. To solve these problems, we...
Main Authors: | Guixiang Wei, Huijian Zhou, Liping Zhang, Jianji Wang |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-05-01 |
Series: | Sensors |
Subjects: | fitness yoga; human action recognition; self-attention mechanism |
Online Access: | https://www.mdpi.com/1424-8220/23/10/4741 |
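The record describes a graph convolutional network operating on human pose skeletons extracted from RGB video. As a rough illustration of the underlying idea only — the 5-joint skeleton, normalization scheme, and random weights below are made-up placeholders, not the paper's actual architecture — one spatial graph-convolution step over per-joint features can be sketched in NumPy:

```python
import numpy as np

# Toy 5-joint skeleton (0 head, 1 chest, 2-3 hands, 4 hip) -- a
# hypothetical joint set, not the one used in the paper.
edges = [(0, 1), (1, 2), (1, 3), (1, 4)]
V = 5
A = np.eye(V)                  # adjacency with self-loops
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt   # symmetrically normalized adjacency

def spatial_graph_conv(x, W):
    """One spatial graph-convolution step.

    x: (T, V, C_in) -- T frames, V joints, C_in channels per joint.
    W: (C_in, C_out) weight matrix.
    Each joint aggregates normalized neighbor features, then projects.
    """
    return np.einsum("uv,tvc->tuc", A_hat, x) @ W

rng = np.random.default_rng(0)
x = rng.standard_normal((16, V, 3))   # 16 frames, (x, y, confidence) per joint
W = rng.standard_normal((3, 8)) * 0.1
y = spatial_graph_conv(x, W)
print(y.shape)  # (16, 5, 8)
```

Stacking such spatial layers with temporal convolutions over the frame axis gives the usual ST-GCN-style backbone that the paper's attention module plugs into.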
_version_ | 1797598394874593280 |
---|---|
author | Guixiang Wei; Huijian Zhou; Liping Zhang; Jianji Wang |
author_facet | Guixiang Wei; Huijian Zhou; Liping Zhang; Jianji Wang |
author_sort | Guixiang Wei |
collection | DOAJ |
description | Fitness yoga is now a popular form of national fitness and sports-based physical therapy. At present, Microsoft Kinect, a depth sensor, and other applications are widely used to monitor and guide yoga performance, but they are inconvenient to use and relatively expensive. To solve these problems, we propose spatial–temporal self-attention enhanced graph convolutional networks (STSAE-GCNs) that can analyze RGB yoga video data captured by cameras or smartphones. In the STSAE-GCN, we build a spatial–temporal self-attention module (STSAM), which effectively enhances the model's spatial–temporal representation ability and improves its performance. The STSAM is plug-and-play, so it can be applied to other skeleton-based action recognition methods to improve their performance. To prove the effectiveness of the proposed model in recognizing fitness yoga actions, we collected 960 fitness yoga video clips across 10 action classes to build the Yoga10 dataset. The model achieves 93.83% recognition accuracy on Yoga10, outperforming state-of-the-art methods, which shows that it can better recognize fitness yoga actions and help students learn fitness yoga independently. |
first_indexed | 2024-03-11T03:20:35Z |
format | Article |
id | doaj.art-99e5aef08b2f4e139211c5d7153aaefe |
institution | Directory Open Access Journal |
issn | 1424-8220 |
language | English |
last_indexed | 2024-03-11T03:20:35Z |
publishDate | 2023-05-01 |
publisher | MDPI AG |
record_format | Article |
series | Sensors |
spelling | doaj.art-99e5aef08b2f4e139211c5d7153aaefe (2023-11-18T03:11:52Z) | eng | MDPI AG | Sensors, ISSN 1424-8220, 2023-05-01, Vol. 23, No. 10, Art. 4741, DOI 10.3390/s23104741 | Spatial–Temporal Self-Attention Enhanced Graph Convolutional Networks for Fitness Yoga Action Recognition | Guixiang Wei (School of Sports Center, Xi’an Jiaotong University, Xi’an 710000, China); Huijian Zhou (School of Software Engineering, Xi’an Jiaotong University, Xi’an 710000, China); Liping Zhang (School of Sports Center, Xi’an Jiaotong University, Xi’an 710000, China); Jianji Wang (Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University, Xi’an 710000, China) | https://www.mdpi.com/1424-8220/23/10/4741 | fitness yoga; human action recognition; self-attention mechanism |
spellingShingle | Guixiang Wei; Huijian Zhou; Liping Zhang; Jianji Wang; Spatial–Temporal Self-Attention Enhanced Graph Convolutional Networks for Fitness Yoga Action Recognition; Sensors; fitness yoga; human action recognition; self-attention mechanism |
title | Spatial–Temporal Self-Attention Enhanced Graph Convolutional Networks for Fitness Yoga Action Recognition |
title_full | Spatial–Temporal Self-Attention Enhanced Graph Convolutional Networks for Fitness Yoga Action Recognition |
title_fullStr | Spatial–Temporal Self-Attention Enhanced Graph Convolutional Networks for Fitness Yoga Action Recognition |
title_full_unstemmed | Spatial–Temporal Self-Attention Enhanced Graph Convolutional Networks for Fitness Yoga Action Recognition |
title_short | Spatial–Temporal Self-Attention Enhanced Graph Convolutional Networks for Fitness Yoga Action Recognition |
title_sort | spatial temporal self attention enhanced graph convolutional networks for fitness yoga action recognition |
topic | fitness yoga; human action recognition; self-attention mechanism |
url | https://www.mdpi.com/1424-8220/23/10/4741 |
work_keys_str_mv | AT guixiangwei spatialtemporalselfattentionenhancedgraphconvolutionalnetworksforfitnessyogaactionrecognition AT huijianzhou spatialtemporalselfattentionenhancedgraphconvolutionalnetworksforfitnessyogaactionrecognition AT lipingzhang spatialtemporalselfattentionenhancedgraphconvolutionalnetworksforfitnessyogaactionrecognition AT jianjiwang spatialtemporalselfattentionenhancedgraphconvolutionalnetworksforfitnessyogaactionrecognition |
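The abstract describes the STSAM as a plug-and-play spatial–temporal self-attention module. As a minimal NumPy sketch of generic self-attention over all (frame, joint) tokens — with random stand-in projection matrices and a simple residual connection, and no claim to match the paper's actual STSAM design — the core computation might look like:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_temporal_self_attention(x, seed=0):
    """Toy self-attention over a skeleton sequence.

    x: (T, V, C) -- T frames, V joints, C channels.
    Flattens frames and joints into T*V tokens so every joint at every
    frame can attend to every other joint/frame, then adds a residual.
    Projections are random placeholders standing in for learned weights.
    """
    T, V, C = x.shape
    tokens = x.reshape(T * V, C)
    rng = np.random.default_rng(seed)
    Wq, Wk, Wv = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    attn = softmax(q @ k.T / np.sqrt(C), axis=-1)   # (T*V, T*V) weights
    out = attn @ v
    # Residual connection: typical of plug-and-play attention blocks,
    # which must preserve the input shape of the host network.
    return (tokens + out).reshape(T, V, C)

x = np.random.default_rng(1).standard_normal((4, 25, 8))  # 4 frames, 25 joints
y = spatial_temporal_self_attention(x)
print(y.shape)  # (4, 25, 8)
```

Because the output shape matches the input shape, such a block can be dropped between layers of an existing skeleton-based recognizer without changing the rest of the network — the "plug-and-play" property the abstract highlights.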