A Real-Time 3-Dimensional Object Detection Based Human Action Recognition Model
Computer vision technologies have improved greatly in the last few years, and many problems have been solved by combining deep learning with greater computational power. Action recognition is one such problem that needs to be addressed: Human Action Recognition (HAR) can be adopted in intelligent video surveillance systems...
Main Authors: | Chhaya Gupta, Nasib Singh Gill, Preeti Gulia, Sangeeta Yadav, Giovanni Pau, Mohammad Alibakhshikenari, Xiangjie Kong |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2024-01-01 |
Series: | IEEE Open Journal of the Computer Society |
Subjects: | CNN, feature extraction, human action recognition, multiplicative LSTM, skeleton articulation |
Online Access: | https://ieeexplore.ieee.org/document/10323158/ |
author | Chhaya Gupta; Nasib Singh Gill; Preeti Gulia; Sangeeta Yadav; Giovanni Pau; Mohammad Alibakhshikenari; Xiangjie Kong |
author_facet | Chhaya Gupta; Nasib Singh Gill; Preeti Gulia; Sangeeta Yadav; Giovanni Pau; Mohammad Alibakhshikenari; Xiangjie Kong |
author_sort | Chhaya Gupta |
collection | DOAJ |
description | Computer vision technologies have improved greatly in the last few years, and many problems have been solved by combining deep learning with greater computational power. Action recognition is one such problem that needs to be addressed: Human Action Recognition (HAR) can be adopted in intelligent video surveillance systems, and governments may use it for crime monitoring and security purposes. This paper proposes a deep learning-based HAR model, a 3-dimensional convolutional network with a multiplicative LSTM. The proposed model makes it easier to understand the tasks that an individual or a group of individuals performs. The four-phase model combines a 3D convolutional neural network (3DCNN) with a multiplicative LSTM recurrent network and YOLOv6 for real-time object detection; its four stages are data fusion, feature extraction, object identification, and skeleton articulation. The NTU-RGB-D, KITTI, NTU-RGB-D 120, UCF 101, and Fused datasets are among those used to train the model. The proposed model surpasses other cutting-edge models, reaching accuracies of 98.23%, 97.65%, 98.76%, 95.45%, and 97.65% on these datasets, respectively. The other state-of-the-art (SOTA) methods compared in this study are a traditional CNN, YOLOv6, and a CNN with BiLSTM. The results verify that the proposed model, which combines these techniques, classifies actions more accurately than the existing ones. |
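To make the described architecture concrete, the sketch below shows, in PyTorch, the general shape of a 3D-CNN feature extractor feeding a multiplicative LSTM (mLSTM) head for clip-level action classification. It is a minimal illustration assembled from the abstract alone, not the authors' code: the layer sizes, the mLSTM formulation (following Krause et al.'s multiplicative LSTM), and the `ActionRecognizer` and `MultiplicativeLSTMCell` names are assumptions, and the YOLOv6 object-detection, data-fusion, and skeleton-articulation stages of the full four-phase pipeline are omitted.

```python
# Minimal sketch of a 3D-CNN + multiplicative-LSTM classifier
# (assumed architecture, not the paper's released implementation).
import torch
import torch.nn as nn


class MultiplicativeLSTMCell(nn.Module):
    """Simplified multiplicative LSTM cell in the style of Krause et al. (2016)."""

    def __init__(self, input_size: int, hidden_size: int):
        super().__init__()
        self.Wmx = nn.Linear(input_size, hidden_size, bias=False)
        self.Wmh = nn.Linear(hidden_size, hidden_size, bias=False)
        self.Wx = nn.Linear(input_size, 4 * hidden_size)
        self.Wm = nn.Linear(hidden_size, 4 * hidden_size)

    def forward(self, x, state):
        h, c = state
        m = self.Wmx(x) * self.Wmh(h)          # multiplicative intermediate state
        i, f, o, g = (self.Wx(x) + self.Wm(m)).chunk(4, dim=-1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c


class ActionRecognizer(nn.Module):
    """3D-CNN feature extractor followed by an mLSTM head and a linear classifier."""

    def __init__(self, num_classes: int = 60, hidden_size: int = 256):
        # num_classes=60 is an example value (e.g., the 60 classes of NTU RGB+D).
        super().__init__()
        self.hidden_size = hidden_size
        self.cnn3d = nn.Sequential(             # spatio-temporal features, temporal axis kept
            nn.Conv3d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),
            nn.Conv3d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool3d((None, 1, 1)),  # -> (B, 64, T, 1, 1)
        )
        self.rnn = MultiplicativeLSTMCell(64, hidden_size)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, clip):
        # clip: (B, 3, T, H, W) RGB video tensor
        feats = self.cnn3d(clip).flatten(2).transpose(1, 2)  # (B, T, 64)
        B, T, _ = feats.shape
        h = feats.new_zeros(B, self.hidden_size)
        c = feats.new_zeros(B, self.hidden_size)
        for t in range(T):                       # unroll the mLSTM over the time axis
            h, c = self.rnn(feats[:, t], (h, c))
        return self.fc(h)                        # per-clip action logits


# Example: two 16-frame RGB clips at 112x112 resolution (random stand-in data).
logits = ActionRecognizer()(torch.randn(2, 3, 16, 112, 112))
print(logits.shape)  # torch.Size([2, 60])
```

In the full four-stage model, per-frame YOLOv6 detections and fused skeleton features would presumably be combined with (or replace) the raw RGB clip before the recurrent stage; here a random tensor stands in for that input.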
first_indexed | 2024-03-08T18:46:10Z |
format | Article |
id | doaj.art-8b26e7c3c97342e7b9b5c0b04b6ce85b |
institution | Directory Open Access Journal |
issn | 2644-1268 |
language | English |
last_indexed | 2024-03-08T18:46:10Z |
publishDate | 2024-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Open Journal of the Computer Society |
spelling | doaj.art-8b26e7c3c97342e7b9b5c0b04b6ce85b; 2023-12-29T00:04:06Z; eng; IEEE; IEEE Open Journal of the Computer Society; ISSN 2644-1268; 2024-01-01; vol. 5, pp. 14-26; DOI 10.1109/OJCS.2023.3334528; article no. 10323158. A Real-Time 3-Dimensional Object Detection Based Human Action Recognition Model. Authors: Chhaya Gupta (https://orcid.org/0000-0002-8620-2927), Nasib Singh Gill (https://orcid.org/0000-0002-8594-4320), Preeti Gulia (https://orcid.org/0000-0001-8535-4016), and Sangeeta Yadav (https://orcid.org/0000-0003-2625-8096), all with the Department of Computer Science and Applications, Maharshi Dayanand University, Rohtak, India; Giovanni Pau (https://orcid.org/0000-0002-5798-398X), Faculty of Engineering and Architecture, Kore University, Enna, Italy; Mohammad Alibakhshikenari (https://orcid.org/0000-0002-8263-1572), Department of Signal Theory and Communications, Universidad Carlos III de Madrid, Getafe, Spain; Xiangjie Kong (https://orcid.org/0000-0003-2698-3319), College of Computer Science and Technology, Zhejiang University, Hangzhou, China. Online access: https://ieeexplore.ieee.org/document/10323158/ Keywords: CNN; feature extraction; human action recognition; multiplicative LSTM; skeleton articulation |
spellingShingle | Chhaya Gupta; Nasib Singh Gill; Preeti Gulia; Sangeeta Yadav; Giovanni Pau; Mohammad Alibakhshikenari; Xiangjie Kong; A Real-Time 3-Dimensional Object Detection Based Human Action Recognition Model; IEEE Open Journal of the Computer Society; CNN; feature extraction; human action recognition; multiplicative LSTM; skeleton articulation |
title | A Real-Time 3-Dimensional Object Detection Based Human Action Recognition Model |
title_full | A Real-Time 3-Dimensional Object Detection Based Human Action Recognition Model |
title_fullStr | A Real-Time 3-Dimensional Object Detection Based Human Action Recognition Model |
title_full_unstemmed | A Real-Time 3-Dimensional Object Detection Based Human Action Recognition Model |
title_short | A Real-Time 3-Dimensional Object Detection Based Human Action Recognition Model |
title_sort | real time 3 dimensional object detection based human action recognition model |
topic | CNN; feature extraction; human action recognition; multiplicative LSTM; skeleton articulation |
url | https://ieeexplore.ieee.org/document/10323158/ |
work_keys_str_mv | AT chhayagupta arealtime3dimensionalobjectdetectionbasedhumanactionrecognitionmodel AT nasibsinghgill arealtime3dimensionalobjectdetectionbasedhumanactionrecognitionmodel AT preetigulia arealtime3dimensionalobjectdetectionbasedhumanactionrecognitionmodel AT sangeetayadav arealtime3dimensionalobjectdetectionbasedhumanactionrecognitionmodel AT giovannipau arealtime3dimensionalobjectdetectionbasedhumanactionrecognitionmodel AT mohammadalibakhshikenari arealtime3dimensionalobjectdetectionbasedhumanactionrecognitionmodel AT xiangjiekong arealtime3dimensionalobjectdetectionbasedhumanactionrecognitionmodel AT chhayagupta realtime3dimensionalobjectdetectionbasedhumanactionrecognitionmodel AT nasibsinghgill realtime3dimensionalobjectdetectionbasedhumanactionrecognitionmodel AT preetigulia realtime3dimensionalobjectdetectionbasedhumanactionrecognitionmodel AT sangeetayadav realtime3dimensionalobjectdetectionbasedhumanactionrecognitionmodel AT giovannipau realtime3dimensionalobjectdetectionbasedhumanactionrecognitionmodel AT mohammadalibakhshikenari realtime3dimensionalobjectdetectionbasedhumanactionrecognitionmodel AT xiangjiekong realtime3dimensionalobjectdetectionbasedhumanactionrecognitionmodel |