Spatial–Temporal Self-Attention Enhanced Graph Convolutional Networks for Fitness Yoga Action Recognition

Fitness yoga is now a popular form of national fitness and sports physical therapy. At present, depth sensors such as the Microsoft Kinect and other specialized applications are widely used to monitor and guide yoga performance, but they are inconvenient to use and relatively expensive. To solve these problems, we propose spatial–temporal self-attention enhanced graph convolutional networks (STSAE-GCNs), which can analyze RGB yoga video data captured by ordinary cameras or smartphones. In the STSAE-GCN, we build a spatial–temporal self-attention module (STSAM) that effectively enhances the spatial–temporal expressive ability of the model and improves its performance. The STSAM is plug-and-play, so it can also be applied to other skeleton-based action recognition methods to improve their performance. To prove the effectiveness of the proposed model in recognizing fitness yoga actions, we collected 960 fitness yoga video clips in 10 action classes and built the Yoga10 dataset. The model achieves a recognition accuracy of 93.83% on Yoga10, outperforming state-of-the-art methods, which shows that it can better recognize fitness yoga actions and help students learn fitness yoga independently.

Bibliographic Details
Main Authors: Guixiang Wei, Huijian Zhou, Liping Zhang, Jianji Wang
Format: Article
Language: English
Published: MDPI AG, 2023-05-01
Series: Sensors
Subjects: fitness yoga; human action recognition; self-attention mechanism
Online Access: https://www.mdpi.com/1424-8220/23/10/4741
collection DOAJ
description Fitness yoga is now a popular form of national fitness and sports physical therapy. At present, depth sensors such as the Microsoft Kinect and other specialized applications are widely used to monitor and guide yoga performance, but they are inconvenient to use and relatively expensive. To solve these problems, we propose spatial–temporal self-attention enhanced graph convolutional networks (STSAE-GCNs), which can analyze RGB yoga video data captured by ordinary cameras or smartphones. In the STSAE-GCN, we build a spatial–temporal self-attention module (STSAM) that effectively enhances the spatial–temporal expressive ability of the model and improves its performance. The STSAM is plug-and-play, so it can also be applied to other skeleton-based action recognition methods to improve their performance. To prove the effectiveness of the proposed model in recognizing fitness yoga actions, we collected 960 fitness yoga video clips in 10 action classes and built the Yoga10 dataset. The model achieves a recognition accuracy of 93.83% on Yoga10, outperforming state-of-the-art methods, which shows that it can better recognize fitness yoga actions and help students learn fitness yoga independently.
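The abstract describes the STSAM as a plug-and-play spatial–temporal self-attention block for skeleton-based recognition. The paper's exact design is not reproduced in this record, but a minimal sketch of the general idea, with joints across all frames attending to one another and a residual connection so the block can be dropped into an existing backbone without changing tensor shapes, might look like the following (NumPy; all function and parameter names here are hypothetical, not the authors'):

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def st_self_attention(x, wq, wk, wv):
    """Toy spatial-temporal self-attention over a skeleton sequence.

    x : (T, V, C) array -- T frames, V joints, C feature channels.
    wq, wk, wv : (C, C) projection matrices.

    All T*V joint features attend to one another, so spatial and
    temporal dependencies are modeled jointly; the residual sum keeps
    the output shape equal to the input shape (plug-and-play).
    """
    t, v, c = x.shape
    tokens = x.reshape(t * v, c)                    # frames x joints -> tokens
    q, k, val = tokens @ wq, tokens @ wk, tokens @ wv
    attn = softmax(q @ k.T / np.sqrt(c))            # (T*V, T*V) attention map
    out = (attn @ val).reshape(t, v, c)
    return x + out                                  # residual connection

# Shape-preserving, so it could wrap an existing GCN layer's features:
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 17, 8))                 # 4 frames, 17 joints, 8 channels
wq, wk, wv = (0.1 * rng.standard_normal((8, 8)) for _ in range(3))
y = st_self_attention(x, wq, wk, wv)
```

Because input and output shapes match, such a block can be inserted between layers of a skeleton-based model without further changes, which is what the "plug-and-play" claim amounts to.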
id doaj.art-99e5aef08b2f4e139211c5d7153aaefe
institution Directory of Open Access Journals (DOAJ)
issn 1424-8220
doi 10.3390/s23104741
affiliations Guixiang Wei: School of Sports Center, Xi’an Jiaotong University, Xi’an 710000, China
Huijian Zhou: School of Software Engineering, Xi’an Jiaotong University, Xi’an 710000, China
Liping Zhang: School of Sports Center, Xi’an Jiaotong University, Xi’an 710000, China
Jianji Wang: Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University, Xi’an 710000, China
keywords fitness yoga; human action recognition; self-attention mechanism