Body-Pose-Guided Action Recognition with Convolutional Long Short-Term Memory (LSTM) in Aerial Videos

The accurate detection and recognition of human actions play a pivotal role in aerial surveillance, enabling the identification of potential threats and suspicious behavior. Several approaches have been presented to address this problem, but the limitation still remains in devising an accurate and r...

Bibliographic Details
Main Authors:	Sohaib Mustafa Saeed, Hassan Akbar, Tahir Nawaz, Hassan Elahi, Umar Shahbaz Khan
Format:	Article
Language:	English
Published:	MDPI AG 2023-08-01
Series:	Applied Sciences
Subjects:	deep neural network; convolutional LSTM; action recognition; body pose keypoints; aerial surveillance
Online Access:	https://www.mdpi.com/2076-3417/13/16/9384

_version_	1827730464267304960
author	Sohaib Mustafa Saeed, Hassan Akbar, Tahir Nawaz, Hassan Elahi, Umar Shahbaz Khan
author_facet	Sohaib Mustafa Saeed, Hassan Akbar, Tahir Nawaz, Hassan Elahi, Umar Shahbaz Khan
author_sort	Sohaib Mustafa Saeed
collection	DOAJ
description	The accurate detection and recognition of human actions play a pivotal role in aerial surveillance, enabling the identification of potential threats and suspicious behavior. Several approaches have been proposed to address this problem, but devising an accurate and robust solution remains a challenge. To this end, this paper presents an action recognition framework for aerial surveillance that employs the YOLOv8-Pose keypoint extraction algorithm and a customized sequential ConvLSTM (Convolutional Long Short-Term Memory) model to classify the action. A detailed experimental evaluation and comparison with several existing approaches on the publicly available Drone Action dataset demonstrates the framework's effectiveness. Its overall accuracy on the three provided dataset splits is 74%, 80%, and 70%, for a mean accuracy of 74.67%. The proposed system captures the spatial and temporal dynamics of human actions, providing a robust solution for aerial action recognition.
first_indexed	2024-03-11T00:08:14Z
format	Article
id	doaj.art-3343d9d2c7854aeb9b47d3ad0356a33e
institution	Directory of Open Access Journals
issn	2076-3417
language	English
last_indexed	2024-03-11T00:08:14Z
publishDate	2023-08-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj.art-3343d9d2c7854aeb9b47d3ad0356a33e; 2023-11-19T00:08:57Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2023-08-01; vol. 13, no. 16, art. 9384; DOI 10.3390/app13169384; Body-Pose-Guided Action Recognition with Convolutional Long Short-Term Memory (LSTM) in Aerial Videos; Sohaib Mustafa Saeed, Hassan Akbar, Tahir Nawaz, Hassan Elahi, Umar Shahbaz Khan (all: Department of Mechatronics Engineering, National University of Sciences and Technology (NUST), Islamabad 44000, Pakistan); https://www.mdpi.com/2076-3417/13/16/9384; deep neural network; convolutional LSTM; action recognition; body pose keypoints; aerial surveillance
title	Body-Pose-Guided Action Recognition with Convolutional Long Short-Term Memory (LSTM) in Aerial Videos
title_full	Body-Pose-Guided Action Recognition with Convolutional Long Short-Term Memory (LSTM) in Aerial Videos
title_fullStr	Body-Pose-Guided Action Recognition with Convolutional Long Short-Term Memory (LSTM) in Aerial Videos
title_full_unstemmed	Body-Pose-Guided Action Recognition with Convolutional Long Short-Term Memory (LSTM) in Aerial Videos
title_short	Body-Pose-Guided Action Recognition with Convolutional Long Short-Term Memory (LSTM) in Aerial Videos
title_sort	body pose guided action recognition with convolutional long short term memory lstm in aerial videos
topic	deep neural network; convolutional LSTM; action recognition; body pose keypoints; aerial surveillance
url	https://www.mdpi.com/2076-3417/13/16/9384
