Modelling a Spatial-Motion Deep Learning Framework to Classify Dynamic Patterns of Videos

Video classification is an essential process for analyzing the pervasive semantic information of video content in computer vision. Traditional hand-crafted features are insufficient when classifying complex video information due to the similarity of visual contents with different illumination condit...

Bibliographic Details
Main Authors: Sandeli Priyanwada Kasthuri Arachchi, Timothy K. Shih, Noorkholis Luthfil Hakim
Format: Article
Language: English
Published: MDPI AG 2020-02-01
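Series: Applied Sciences
Subjects: dynamic pattern classification; deep learning; spatiotemporal data; convolution neural network; recurrent neural network
Online Access: https://www.mdpi.com/2076-3417/10/4/1479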
author Sandeli Priyanwada Kasthuri Arachchi
Timothy K. Shih
Noorkholis Luthfil Hakim
collection DOAJ
description Video classification is an essential process for analyzing the pervasive semantic information of video content in computer vision. Traditional hand-crafted features are insufficient for classifying complex video information due to the similarity of visual content under different illumination conditions. Prior studies of video classification focused on the relationship between the standalone streams themselves. In this paper, by leveraging deep learning methodologies, we propose a two-stream neural network concept named state-exchanging long short-term memory (SE-LSTM). With its spatial-motion state-exchanging model, the SE-LSTM can classify dynamic patterns of videos using appearance and motion features. The SE-LSTM extends the general-purpose LSTM by exchanging information with the previous cell states of both the appearance and motion streams. We propose a novel two-stream model, Dual-CNNSELSTM, that combines the SE-LSTM concept with a Convolutional Neural Network, and we use various video datasets to validate the proposed architecture. The experimental results demonstrate that the proposed two-stream Dual-CNNSELSTM architecture performs well, achieving accuracies of 81.62%, 79.87%, and 69.86% on the hand gestures, fireworks displays, and HMDB51 datasets, respectively. Furthermore, the overall results indicate that the proposed model is best suited to classifying dynamic patterns against static backgrounds.
first_indexed 2024-12-13T12:19:11Z
format Article
id doaj.art-f66ecea812924c5eba44893eafe49d81
institution Directory Open Access Journal
issn 2076-3417
language English
last_indexed 2024-12-13T12:19:11Z
publishDate 2020-02-01
publisher MDPI AG
record_format Article
series Applied Sciences
spelling doaj.art-f66ecea812924c5eba44893eafe49d81
2022-12-21T23:46:38Z
eng
MDPI AG
Applied Sciences
2076-3417
2020-02-01
vol. 10, no. 4, art. 1479
10.3390/app10041479
app10041479
Modelling a Spatial-Motion Deep Learning Framework to Classify Dynamic Patterns of Videos
Sandeli Priyanwada Kasthuri Arachchi (Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan)
Timothy K. Shih (Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan)
Noorkholis Luthfil Hakim (Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan)
https://www.mdpi.com/2076-3417/10/4/1479
dynamic pattern classification; deep learning; spatiotemporal data; convolution neural network; recurrent neural network
title Modelling a Spatial-Motion Deep Learning Framework to Classify Dynamic Patterns of Videos
topic dynamic pattern classification
deep learning
spatiotemporal data
convolution neural network
recurrent neural network
url https://www.mdpi.com/2076-3417/10/4/1479