Body-Pose-Guided Action Recognition with Convolutional Long Short-Term Memory (LSTM) in Aerial Videos
The accurate detection and recognition of human actions play a pivotal role in aerial surveillance, enabling the identification of potential threats and suspicious behavior. Several approaches have been presented to address this problem, but the limitation still remains in devising an accurate and r...
Main Authors: | Sohaib Mustafa Saeed, Hassan Akbar, Tahir Nawaz, Hassan Elahi, Umar Shahbaz Khan |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-08-01 |
Series: | Applied Sciences |
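Subjects: | deep neural network; convolutional LSTM; action recognition; body pose keypoints; aerial surveillance |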
Online Access: | https://www.mdpi.com/2076-3417/13/16/9384 |
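The description above says per-frame body keypoints from YOLOv8-Pose are passed to a sequential ConvLSTM classifier. As a rough sketch of the data shaping this implies, the snippet below groups per-frame keypoints into fixed-length clips. The clip length, the 5-D layout, and the helper name `make_clips` are assumptions for illustration and are not taken from the paper; the 17-keypoint count is that of the COCO-pretrained YOLOv8-Pose models.

```python
import numpy as np

NUM_KEYPOINTS = 17  # COCO-style body keypoints output by COCO-pretrained YOLOv8-Pose models
CLIP_LEN = 16       # hypothetical clip length; the record does not state one


def make_clips(frames, clip_len=CLIP_LEN):
    """Group per-frame keypoints into non-overlapping clips.

    frames: array-like of shape (num_frames, 17, 2) holding (x, y) per keypoint.
    Returns an array of shape (num_clips, clip_len, 17, 2, 1), i.e. an assumed
    (batch, time, rows, cols, channels) layout of the kind ConvLSTM layers take.
    Trailing frames that do not fill a whole clip are dropped.
    """
    frames = np.asarray(frames, dtype=np.float32)
    if frames.ndim != 3 or frames.shape[1:] != (NUM_KEYPOINTS, 2):
        raise ValueError("expected keypoints shaped (num_frames, 17, 2)")
    num_clips = frames.shape[0] // clip_len
    usable = frames[: num_clips * clip_len]
    clips = usable.reshape(num_clips, clip_len, NUM_KEYPOINTS, 2)
    return clips[..., np.newaxis]  # add a single channel axis


if __name__ == "__main__":
    # 40 frames of dummy keypoints yield two 16-frame clips; 8 frames are dropped.
    dummy = np.zeros((40, NUM_KEYPOINTS, 2))
    print(make_clips(dummy).shape)
```

Non-overlapping windows keep the sketch short; a sliding window with overlap would be an equally plausible choice.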
_version_ | 1797585601166311424 |
---|---|
author | Sohaib Mustafa Saeed; Hassan Akbar; Tahir Nawaz; Hassan Elahi; Umar Shahbaz Khan |
author_facet | Sohaib Mustafa Saeed; Hassan Akbar; Tahir Nawaz; Hassan Elahi; Umar Shahbaz Khan |
author_sort | Sohaib Mustafa Saeed |
collection | DOAJ |
description | The accurate detection and recognition of human actions play a pivotal role in aerial surveillance, enabling the identification of potential threats and suspicious behavior. Several approaches have been proposed to address this problem, but devising an accurate and robust solution remains a challenge. To this end, this paper presents an action recognition framework for aerial surveillance that employs the YOLOv8-Pose keypoint extraction algorithm and a customized sequential ConvLSTM (Convolutional Long Short-Term Memory) model to classify the action. A detailed experimental evaluation and comparison with several existing approaches on the publicly available Drone Action dataset demonstrates the framework's effectiveness. The overall accuracy of the framework on the three provided dataset splits is 74%, 80%, and 70%, with a mean accuracy of 74.67%. The proposed system captures the spatial and temporal dynamics of human actions, providing a robust solution for aerial action recognition. |
first_indexed | 2024-03-11T00:08:14Z |
format | Article |
id | doaj.art-3343d9d2c7854aeb9b47d3ad0356a33e |
institution | Directory Open Access Journal |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-11T00:08:14Z |
publishDate | 2023-08-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-3343d9d2c7854aeb9b47d3ad0356a33e; 2023-11-19T00:08:57Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2023-08-01; vol. 13, no. 16, art. 9384; DOI 10.3390/app13169384; Body-Pose-Guided Action Recognition with Convolutional Long Short-Term Memory (LSTM) in Aerial Videos; Sohaib Mustafa Saeed, Hassan Akbar, Tahir Nawaz, Hassan Elahi, Umar Shahbaz Khan (all: Department of Mechatronics Engineering, National University of Sciences and Technology (NUST), Islamabad 44000, Pakistan); https://www.mdpi.com/2076-3417/13/16/9384; deep neural network; convolutional LSTM; action recognition; body pose keypoints; aerial surveillance |
title | Body-Pose-Guided Action Recognition with Convolutional Long Short-Term Memory (LSTM) in Aerial Videos |
topic | deep neural network; convolutional LSTM; action recognition; body pose keypoints; aerial surveillance |
url | https://www.mdpi.com/2076-3417/13/16/9384 |
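The mean accuracy quoted in the description field follows directly from the three per-split figures. A minimal check in Python, using only the numbers stated in this record (the variable names are illustrative):

```python
# Per-split accuracies (%) as reported in the record's description field.
split_accuracies = [74.0, 80.0, 70.0]

# Unweighted mean over the three dataset splits.
mean_accuracy = sum(split_accuracies) / len(split_accuracies)

print(f"{mean_accuracy:.2f}%")  # 74.67%
```

This treats the mean as an unweighted average over splits, which matches the reported 74.67%.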