Spatial–Temporal Self-Attention Enhanced Graph Convolutional Networks for Fitness Yoga Action Recognition
Fitness yoga is now a popular form of national fitness and sports-based physical therapy. At present, Microsoft Kinect, a depth sensor, and other applications are widely used to monitor and guide yoga performance, but they are inconvenient to use and relatively expensive. To solve these problems, we...
Main Authors: | Guixiang Wei, Huijian Zhou, Liping Zhang, Jianji Wang |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-05-01 |
Series: | Sensors |
Subjects: | fitness yoga; human action recognition; self-attention mechanism |
Online Access: | https://www.mdpi.com/1424-8220/23/10/4741 |
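The record describes a graph convolutional network operating on human pose skeletons extracted from RGB video. As a rough illustration of the underlying idea only — the 5-joint skeleton, normalization scheme, and random weights below are made-up placeholders, not the paper's actual architecture — one spatial graph-convolution step over per-joint features can be sketched in NumPy:

```python
import numpy as np

# Toy 5-joint skeleton (0 head, 1 chest, 2-3 hands, 4 hip) -- a
# hypothetical joint set, not the one used in the paper.
edges = [(0, 1), (1, 2), (1, 3), (1, 4)]
V = 5
A = np.eye(V)                  # adjacency with self-loops
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
A_hat = D_inv_sqrt @ A @ D_inv_sqrt   # symmetrically normalized adjacency

def spatial_graph_conv(x, W):
    """One spatial graph-convolution step.

    x: (T, V, C_in) -- T frames, V joints, C_in channels per joint.
    W: (C_in, C_out) weight matrix.
    Each joint aggregates normalized neighbor features, then projects.
    """
    return np.einsum("uv,tvc->tuc", A_hat, x) @ W

rng = np.random.default_rng(0)
x = rng.standard_normal((16, V, 3))   # 16 frames, (x, y, confidence) per joint
W = rng.standard_normal((3, 8)) * 0.1
y = spatial_graph_conv(x, W)
print(y.shape)  # (16, 5, 8)
```

Stacking such spatial layers with temporal convolutions over the frame axis gives the usual ST-GCN-style backbone that the paper's attention module plugs into.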
_version_ | 1797598394874593280 |
---|---|
author | Guixiang Wei; Huijian Zhou; Liping Zhang; Jianji Wang |
author_facet | Guixiang Wei; Huijian Zhou; Liping Zhang; Jianji Wang |
author_sort | Guixiang Wei |
collection | DOAJ |
description | Fitness yoga is now a popular form of national fitness and sports-based physical therapy. At present, Microsoft Kinect, a depth sensor, and other applications are widely used to monitor and guide yoga performance, but they are inconvenient to use and relatively expensive. To solve these problems, we propose spatial–temporal self-attention enhanced graph convolutional networks (STSAE-GCNs) that can analyze RGB yoga video data captured by cameras or smartphones. In the STSAE-GCN, we build a spatial–temporal self-attention module (STSAM), which effectively enhances the model's spatial–temporal representation ability and improves its performance. The STSAM is plug-and-play, so it can be applied to other skeleton-based action recognition methods to improve their performance. To prove the effectiveness of the proposed model in recognizing fitness yoga actions, we collected 960 fitness yoga video clips across 10 action classes to build the Yoga10 dataset. The model achieves 93.83% recognition accuracy on Yoga10, outperforming state-of-the-art methods, which shows that it can better recognize fitness yoga actions and help students learn fitness yoga independently. |
first_indexed | 2024-03-11T03:20:35Z |
format | Article |
id | doaj.art-99e5aef08b2f4e139211c5d7153aaefe |
institution | Directory Open Access Journal |
issn | 1424-8220 |
language | English |
last_indexed | 2024-03-11T03:20:35Z |
publishDate | 2023-05-01 |
publisher | MDPI AG |
record_format | Article |
series | Sensors |
spelling | doaj.art-99e5aef08b2f4e139211c5d7153aaefe (2023-11-18T03:11:52Z) | eng | MDPI AG | Sensors, ISSN 1424-8220, 2023-05-01, Vol. 23, No. 10, Art. 4741, DOI 10.3390/s23104741 | Spatial–Temporal Self-Attention Enhanced Graph Convolutional Networks for Fitness Yoga Action Recognition | Guixiang Wei (School of Sports Center, Xi’an Jiaotong University, Xi’an 710000, China); Huijian Zhou (School of Software Engineering, Xi’an Jiaotong University, Xi’an 710000, China); Liping Zhang (School of Sports Center, Xi’an Jiaotong University, Xi’an 710000, China); Jianji Wang (Institute of Artificial Intelligence and Robotics, Xi’an Jiaotong University, Xi’an 710000, China) | https://www.mdpi.com/1424-8220/23/10/4741 | fitness yoga; human action recognition; self-attention mechanism |
spellingShingle | Guixiang Wei; Huijian Zhou; Liping Zhang; Jianji Wang; Spatial–Temporal Self-Attention Enhanced Graph Convolutional Networks for Fitness Yoga Action Recognition; Sensors; fitness yoga; human action recognition; self-attention mechanism |
title | Spatial–Temporal Self-Attention Enhanced Graph Convolutional Networks for Fitness Yoga Action Recognition |
title_full | Spatial–Temporal Self-Attention Enhanced Graph Convolutional Networks for Fitness Yoga Action Recognition |
title_fullStr | Spatial–Temporal Self-Attention Enhanced Graph Convolutional Networks for Fitness Yoga Action Recognition |
title_full_unstemmed | Spatial–Temporal Self-Attention Enhanced Graph Convolutional Networks for Fitness Yoga Action Recognition |
title_short | Spatial–Temporal Self-Attention Enhanced Graph Convolutional Networks for Fitness Yoga Action Recognition |
title_sort | spatial temporal self attention enhanced graph convolutional networks for fitness yoga action recognition |
topic | fitness yoga; human action recognition; self-attention mechanism |
url | https://www.mdpi.com/1424-8220/23/10/4741 |
work_keys_str_mv | AT guixiangwei spatialtemporalselfattentionenhancedgraphconvolutionalnetworksforfitnessyogaactionrecognition AT huijianzhou spatialtemporalselfattentionenhancedgraphconvolutionalnetworksforfitnessyogaactionrecognition AT lipingzhang spatialtemporalselfattentionenhancedgraphconvolutionalnetworksforfitnessyogaactionrecognition AT jianjiwang spatialtemporalselfattentionenhancedgraphconvolutionalnetworksforfitnessyogaactionrecognition |
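The abstract describes the STSAM as a plug-and-play spatial–temporal self-attention module. As a minimal NumPy sketch of generic self-attention over all (frame, joint) tokens — with random stand-in projection matrices and a simple residual connection, and no claim to match the paper's actual STSAM design — the core computation might look like:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_temporal_self_attention(x, seed=0):
    """Toy self-attention over a skeleton sequence.

    x: (T, V, C) -- T frames, V joints, C channels.
    Flattens frames and joints into T*V tokens so every joint at every
    frame can attend to every other joint/frame, then adds a residual.
    Projections are random placeholders standing in for learned weights.
    """
    T, V, C = x.shape
    tokens = x.reshape(T * V, C)
    rng = np.random.default_rng(seed)
    Wq, Wk, Wv = (rng.standard_normal((C, C)) / np.sqrt(C) for _ in range(3))
    q, k, v = tokens @ Wq, tokens @ Wk, tokens @ Wv
    attn = softmax(q @ k.T / np.sqrt(C), axis=-1)   # (T*V, T*V) weights
    out = attn @ v
    # Residual connection: typical of plug-and-play attention blocks,
    # which must preserve the input shape of the host network.
    return (tokens + out).reshape(T, V, C)

x = np.random.default_rng(1).standard_normal((4, 25, 8))  # 4 frames, 25 joints
y = spatial_temporal_self_attention(x)
print(y.shape)  # (4, 25, 8)
```

Because the output shape matches the input shape, such a block can be dropped between layers of an existing skeleton-based recognizer without changing the rest of the network — the "plug-and-play" property the abstract highlights.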