Body-Pose-Guided Action Recognition with Convolutional Long Short-Term Memory (LSTM) in Aerial Videos

The accurate detection and recognition of human actions play a pivotal role in aerial surveillance, enabling the identification of potential threats and suspicious behavior. Several approaches have been presented to address this problem, but the limitation still remains in devising an accurate and r...

Bibliographic Details
Main Authors: Sohaib Mustafa Saeed, Hassan Akbar, Tahir Nawaz, Hassan Elahi, Umar Shahbaz Khan
Format: Article
Language: English
Published: MDPI AG 2023-08-01
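Series: Applied Sciences
Subjects: deep neural network; convolutional LSTM; action recognition; body pose keypoints; aerial surveillance
Online Access: https://www.mdpi.com/2076-3417/13/16/9384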
_version_ 1797585601166311424
author Sohaib Mustafa Saeed
Hassan Akbar
Tahir Nawaz
Hassan Elahi
Umar Shahbaz Khan
author_facet Sohaib Mustafa Saeed
Hassan Akbar
Tahir Nawaz
Hassan Elahi
Umar Shahbaz Khan
author_sort Sohaib Mustafa Saeed
collection DOAJ
description The accurate detection and recognition of human actions play a pivotal role in aerial surveillance, enabling the identification of potential threats and suspicious behavior. Several approaches have been proposed to address this problem, but devising an accurate and robust solution remains a challenge. To this end, this paper presents an action recognition framework for aerial surveillance that employs the YOLOv8-Pose keypoint extraction algorithm and a customized sequential ConvLSTM (Convolutional Long Short-Term Memory) model to classify the action. A detailed experimental evaluation and comparison with several existing approaches on the publicly available Drone Action dataset demonstrates the framework's effectiveness. The overall accuracy of the framework on the three provided dataset splits is 74%, 80%, and 70%, with a mean accuracy of 74.67%. The proposed system captures the spatial and temporal dynamics of human actions, providing a robust solution for aerial action recognition.
first_indexed 2024-03-11T00:08:14Z
format Article
id doaj.art-3343d9d2c7854aeb9b47d3ad0356a33e
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-03-11T00:08:14Z
publishDate 2023-08-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-3343d9d2c7854aeb9b47d3ad0356a33e
2023-11-19T00:08:57Z
eng
MDPI AG
Applied Sciences
2076-3417
2023-08-01
13
16
9384
10.3390/app13169384
Body-Pose-Guided Action Recognition with Convolutional Long Short-Term Memory (LSTM) in Aerial Videos
Sohaib Mustafa Saeed, Department of Mechatronics Engineering, National University of Sciences and Technology (NUST), Islamabad 44000, Pakistan
Hassan Akbar, Department of Mechatronics Engineering, National University of Sciences and Technology (NUST), Islamabad 44000, Pakistan
Tahir Nawaz, Department of Mechatronics Engineering, National University of Sciences and Technology (NUST), Islamabad 44000, Pakistan
Hassan Elahi, Department of Mechatronics Engineering, National University of Sciences and Technology (NUST), Islamabad 44000, Pakistan
Umar Shahbaz Khan, Department of Mechatronics Engineering, National University of Sciences and Technology (NUST), Islamabad 44000, Pakistan
The accurate detection and recognition of human actions play a pivotal role in aerial surveillance, enabling the identification of potential threats and suspicious behavior. Several approaches have been presented to address this problem, but the limitation still remains in devising an accurate and robust solution. To this end, this paper presents an effective action recognition framework for aerial surveillance, employing the YOLOv8-Pose keypoints extraction algorithm and a customized sequential ConvLSTM (Convolutional Long Short-Term Memory) model for classifying the action. We performed a detailed experimental evaluation and comparison on the publicly available Drone Action dataset. The evaluation and comparison of the proposed framework with several existing approaches on the publicly available Drone Action dataset demonstrate its effectiveness, achieving a very encouraging performance. The overall accuracy of the framework on three provided dataset splits is 74%, 80%, and 70%, with a mean accuracy of 74.67%. Indeed, the proposed system effectively captures the spatial and temporal dynamics of human actions, providing a robust solution for aerial action recognition.
https://www.mdpi.com/2076-3417/13/16/9384
deep neural network
convolutional LSTM
action recognition
body pose keypoints
aerial surveillance
spellingShingle Sohaib Mustafa Saeed
Hassan Akbar
Tahir Nawaz
Hassan Elahi
Umar Shahbaz Khan
Body-Pose-Guided Action Recognition with Convolutional Long Short-Term Memory (LSTM) in Aerial Videos
Applied Sciences
deep neural network
convolutional LSTM
action recognition
body pose keypoints
aerial surveillance
title Body-Pose-Guided Action Recognition with Convolutional Long Short-Term Memory (LSTM) in Aerial Videos
title_full Body-Pose-Guided Action Recognition with Convolutional Long Short-Term Memory (LSTM) in Aerial Videos
title_fullStr Body-Pose-Guided Action Recognition with Convolutional Long Short-Term Memory (LSTM) in Aerial Videos
title_full_unstemmed Body-Pose-Guided Action Recognition with Convolutional Long Short-Term Memory (LSTM) in Aerial Videos
title_short Body-Pose-Guided Action Recognition with Convolutional Long Short-Term Memory (LSTM) in Aerial Videos
title_sort body pose guided action recognition with convolutional long short term memory lstm in aerial videos
topic deep neural network
convolutional LSTM
action recognition
body pose keypoints
aerial surveillance
url https://www.mdpi.com/2076-3417/13/16/9384
work_keys_str_mv AT sohaibmustafasaeed bodyposeguidedactionrecognitionwithconvolutionallongshorttermmemorylstminaerialvideos
AT hassanakbar bodyposeguidedactionrecognitionwithconvolutionallongshorttermmemorylstminaerialvideos
AT tahirnawaz bodyposeguidedactionrecognitionwithconvolutionallongshorttermmemorylstminaerialvideos
AT hassanelahi bodyposeguidedactionrecognitionwithconvolutionallongshorttermmemorylstminaerialvideos
AT umarshahbazkhan bodyposeguidedactionrecognitionwithconvolutionallongshorttermmemorylstminaerialvideos
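
Note on the reported figures: the description field gives per-split accuracies of 74%, 80%, and 70% and a mean of 74.67%. The short sketch below only re-derives that mean from the three stated values; the variable and function names are illustrative and do not come from the article or its code.

```python
# Per-split accuracies (in percent) as stated in the record's description field.
split_accuracies = [74.0, 80.0, 70.0]


def mean_accuracy(values):
    """Return the arithmetic mean of the given per-split accuracies."""
    return sum(values) / len(values)


mean = mean_accuracy(split_accuracies)

# Rounded to two decimals, this matches the mean quoted in the description.
print(f"{mean:.2f}%")  # prints "74.67%"
```

The sum of the three splits is 224, and 224 / 3 rounds to 74.67, which agrees with the figure in the description.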