Enhancing Human Action Recognition with 3D Skeleton Data: A Comprehensive Study of Deep Learning and Data Augmentation

Bibliographic Details
Main Authors: Chu Xin, Seokhwan Kim, Yongjoo Cho, Kyoung Shin Park
Format: Article
Language: English
Published: MDPI AG 2024-02-01
Series: Electronics
Subjects: data augmentation; skeleton-based human action recognition; CNN; LSTM; CNNLSTM
Online Access: https://www.mdpi.com/2079-9292/13/4/747
author Chu Xin
Seokhwan Kim
Yongjoo Cho
Kyoung Shin Park
collection DOAJ
description Human Action Recognition (HAR) is an important field that identifies human behavior through sensor data. Three-dimensional human skeleton data extracted from the Kinect depth sensor have emerged as a powerful alternative to mitigate the effects of lighting and occlusion of traditional 2D RGB or grayscale image-based HAR. Data augmentation is a key technique to enhance model generalization and robustness in deep learning while suppressing overfitting to training data. In this paper, we conduct a comprehensive study of various data augmentation techniques specific to skeletal data, which aim to improve the accuracy of deep learning models. These augmentation methods include spatial augmentation, which generates augmented samples from the original 3D skeleton sequence, and temporal augmentation, which is designed to capture subtle temporal changes in motion. The evaluation covers two publicly available datasets and a proprietary dataset and employs three neural network models. The results highlight the impact of temporal augmentation on model performance on the skeleton datasets, while exhibiting the nuanced impact of spatial augmentation. The findings underscore the importance of tailoring augmentation strategies to specific dataset characteristics and actions, providing novel perspectives for model selection in skeleton-based human action recognition tasks.
first_indexed 2024-03-07T22:34:50Z
format Article
id doaj.art-be487f8334e34e22af9de62324202fca
institution Directory Open Access Journal
issn 2079-9292
language English
last_indexed 2024-03-07T22:34:50Z
publishDate 2024-02-01
publisher MDPI AG
record_format Article
series Electronics
spelling doaj.art-be487f8334e34e22af9de62324202fca
2024-02-23T15:14:49Z
eng
MDPI AG
Electronics
ISSN 2079-9292
2024-02-01
Volume 13, Issue 4, Article 747
DOI 10.3390/electronics13040747
Chu Xin (0): Department of Artificial Intelligence Convergence, Graduate School, Dankook University, 152 Jukjeon-ro, Suji-gu, Yongin-si 16890, Republic of Korea
Seokhwan Kim (1): Farmkit R&D Center, 502, 55, Heungdeokjooang-ro, Giheung-gu, Yongin-si 16953, Republic of Korea
Yongjoo Cho (2): Department of Computer Science, Sangmyung University, 20 Hongjimoon-2gil, Jongno-gu, Seoul 03016, Republic of Korea
Kyoung Shin Park (3): Department of Computer Engineering, Dankook University, 152 Jukjeon-ro, Suji-gu, Yongin-si 16890, Republic of Korea
title Enhancing Human Action Recognition with 3D Skeleton Data: A Comprehensive Study of Deep Learning and Data Augmentation
topic data augmentation
skeleton-based human action recognition
CNN
LSTM
CNNLSTM
url https://www.mdpi.com/2079-9292/13/4/747