Enhancing Human Action Recognition with 3D Skeleton Data: A Comprehensive Study of Deep Learning and Data Augmentation
Human Action Recognition (HAR) is an important field that identifies human behavior through sensor data. Three-dimensional human skeleton data extracted from the Kinect depth sensor have emerged as a powerful alternative to mitigate the effects of lighting and occlusion of traditional 2D RGB or gray...
Main Authors: | Chu Xin, Seokhwan Kim, Yongjoo Cho, Kyoung Shin Park |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-02-01 |
Series: | Electronics |
Subjects: | data augmentation; skeleton-based human action recognition; CNN; LSTM; CNNLSTM |
Online Access: | https://www.mdpi.com/2079-9292/13/4/747 |
_version_ | 1797298435538288640 |
---|---|
author | Chu Xin, Seokhwan Kim, Yongjoo Cho, Kyoung Shin Park |
author_sort | Chu Xin |
collection | DOAJ |
description | Human Action Recognition (HAR) is an important field that identifies human behavior through sensor data. Three-dimensional human skeleton data extracted from the Kinect depth sensor have emerged as a powerful alternative to mitigate the effects of lighting and occlusion of traditional 2D RGB or grayscale image-based HAR. Data augmentation is a key technique to enhance model generalization and robustness in deep learning while suppressing overfitting to training data. In this paper, we conduct a comprehensive study of various data augmentation techniques specific to skeletal data, which aim to improve the accuracy of deep learning models. These augmentation methods include spatial augmentation, which generates augmented samples from the original 3D skeleton sequence, and temporal augmentation, which is designed to capture subtle temporal changes in motion. The evaluation covers two publicly available datasets and a proprietary dataset and employs three neural network models. The results highlight the impact of temporal augmentation on model performance on the skeleton datasets, while exhibiting the nuanced impact of spatial augmentation. The findings underscore the importance of tailoring augmentation strategies to specific dataset characteristics and actions, providing novel perspectives for model selection in skeleton-based human action recognition tasks. |
first_indexed | 2024-03-07T22:34:50Z |
format | Article |
id | doaj.art-be487f8334e34e22af9de62324202fca |
institution | Directory of Open Access Journals |
issn | 2079-9292 |
language | English |
last_indexed | 2024-03-07T22:34:50Z |
publishDate | 2024-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Electronics |
spelling | doaj.art-be487f8334e34e22af9de62324202fca; 2024-02-23T15:14:49Z; eng; MDPI AG; Electronics; 2079-9292; 2024-02-01; 13; 4; 747; 10.3390/electronics13040747; Enhancing Human Action Recognition with 3D Skeleton Data: A Comprehensive Study of Deep Learning and Data Augmentation; Chu Xin (0), Seokhwan Kim (1), Yongjoo Cho (2), Kyoung Shin Park (3); (0) Department of Artificial Intelligence Convergence, Graduate School, Dankook University, 152 Jukjeon-ro, Suji-gu, Yongin-si 16890, Republic of Korea; (1) Farmkit R&D Center, 502, 55, Heungdeokjooang-ro, Giheung-gu, Yongin-si 16953, Republic of Korea; (2) Department of Computer Science, Sangmyung University, 20 Hongjimoon-2gil, Jongno-gu, Seoul 03016, Republic of Korea; (3) Department of Computer Engineering, Dankook University, 152 Jukjeon-ro, Suji-gu, Yongin-si 16890, Republic of Korea; Human Action Recognition (HAR) is an important field that identifies human behavior through sensor data. Three-dimensional human skeleton data extracted from the Kinect depth sensor have emerged as a powerful alternative to mitigate the effects of lighting and occlusion of traditional 2D RGB or grayscale image-based HAR. Data augmentation is a key technique to enhance model generalization and robustness in deep learning while suppressing overfitting to training data. In this paper, we conduct a comprehensive study of various data augmentation techniques specific to skeletal data, which aim to improve the accuracy of deep learning models. These augmentation methods include spatial augmentation, which generates augmented samples from the original 3D skeleton sequence, and temporal augmentation, which is designed to capture subtle temporal changes in motion. The evaluation covers two publicly available datasets and a proprietary dataset and employs three neural network models. The results highlight the impact of temporal augmentation on model performance on the skeleton datasets, while exhibiting the nuanced impact of spatial augmentation. The findings underscore the importance of tailoring augmentation strategies to specific dataset characteristics and actions, providing novel perspectives for model selection in skeleton-based human action recognition tasks.; https://www.mdpi.com/2079-9292/13/4/747; data augmentation; skeleton-based human action recognition; CNN; LSTM; CNNLSTM |
title | Enhancing Human Action Recognition with 3D Skeleton Data: A Comprehensive Study of Deep Learning and Data Augmentation |
topic | data augmentation; skeleton-based human action recognition; CNN; LSTM; CNNLSTM |
url | https://www.mdpi.com/2079-9292/13/4/747 |