Modelling a Spatial-Motion Deep Learning Framework to Classify Dynamic Patterns of Videos
Video classification is an essential process for analyzing the pervasive semantic information of video content in computer vision. Traditional hand-crafted features are insufficient when classifying complex video information due to the similarity of visual contents with different illumination condit...
Main Authors: | Sandeli Priyanwada Kasthuri Arachchi, Timothy K. Shih, Noorkholis Luthfil Hakim |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG 2020-02-01 |
Series: | Applied Sciences |
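Subjects: | dynamic pattern classification, deep learning, spatiotemporal data, convolution neural network, recurrent neural network |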
Online Access: | https://www.mdpi.com/2076-3417/10/4/1479 |
_version_ | 1818327621361991680 |
---|---|
author | Sandeli Priyanwada Kasthuri Arachchi, Timothy K. Shih, Noorkholis Luthfil Hakim |
author_sort | Sandeli Priyanwada Kasthuri Arachchi |
collection | DOAJ |
description | Video classification is an essential process for analyzing the pervasive semantic information of video content in computer vision. Traditional hand-crafted features are insufficient for classifying complex video information because visual content can look similar under different illumination conditions. Prior studies of video classification focused on the relationships within each standalone stream. In this paper, by leveraging deep learning methodologies, we propose a two-stream neural network concept named state-exchanging long short-term memory (SE-LSTM). Using spatial-motion state exchange, the SE-LSTM can classify dynamic patterns of videos from appearance and motion features. The SE-LSTM extends the general-purpose LSTM by exchanging information with the previous cell states of both the appearance and motion streams. We propose a novel two-stream model, Dual-CNNSELSTM, which combines the SE-LSTM concept with a Convolutional Neural Network, and we use various video datasets to validate the proposed architecture. The experimental results demonstrate that the proposed two-stream Dual-CNNSELSTM architecture significantly outperforms the compared approaches, achieving accuracies of 81.62%, 79.87%, and 69.86% on the hand gestures, fireworks displays, and HMDB51 datasets, respectively. Furthermore, the overall results indicate that the proposed model is best suited to classifying dynamic patterns against static backgrounds. |
first_indexed | 2024-12-13T12:19:11Z |
format | Article |
id | doaj.art-f66ecea812924c5eba44893eafe49d81 |
institution | Directory of Open Access Journals |
issn | 2076-3417 |
language | English |
last_indexed | 2024-12-13T12:19:11Z |
publishDate | 2020-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-f66ecea812924c5eba44893eafe49d81; 2022-12-21T23:46:38Z; eng; MDPI AG; Applied Sciences; ISSN 2076-3417; 2020-02-01; vol. 10, no. 4, art. 1479; DOI 10.3390/app10041479; Modelling a Spatial-Motion Deep Learning Framework to Classify Dynamic Patterns of Videos; Sandeli Priyanwada Kasthuri Arachchi, Timothy K. Shih, Noorkholis Luthfil Hakim (all: Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan); https://www.mdpi.com/2076-3417/10/4/1479; keywords: dynamic pattern classification, deep learning, spatiotemporal data, convolution neural network, recurrent neural network |
title | Modelling a Spatial-Motion Deep Learning Framework to Classify Dynamic Patterns of Videos |
topic | dynamic pattern classification, deep learning, spatiotemporal data, convolution neural network, recurrent neural network |
url | https://www.mdpi.com/2076-3417/10/4/1479 |
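
The description says that the SE-LSTM exchanges information with the previous cell states of the appearance and motion streams, but this record does not give the equations. The following is only an illustrative sketch of one possible reading, not the authors' implementation: it assumes the simplest exchange, in which each stream's LSTM step starts from the other stream's previous cell state. The function names (`make_params`, `lstm_step`, `run_state_exchange`), the random weights, and the hidden size are all hypothetical, and the CNN feature extractors are replaced by plain per-frame feature lists.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_params(input_size, hidden_size, rng):
    """Random weights for the four LSTM gates (hypothetical, untrained)."""
    width = input_size + hidden_size + 1  # inputs, hidden state, bias term
    return {
        gate: [[rng.uniform(-0.1, 0.1) for _ in range(width)]
               for _ in range(hidden_size)]
        for gate in ("i", "f", "o", "g")
    }


def lstm_step(params, x, h_prev, c_prev):
    """One standard LSTM step over plain Python lists."""
    z = x + h_prev + [1.0]  # the trailing 1.0 multiplies the bias weight

    def gate(name, activation):
        return [activation(sum(w * v for w, v in zip(row, z)))
                for row in params[name]]

    i = gate("i", sigmoid)    # input gate
    f = gate("f", sigmoid)    # forget gate
    o = gate("o", sigmoid)    # output gate
    g = gate("g", math.tanh)  # candidate cell values
    c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
    return h, c


def run_state_exchange(appearance_seq, motion_seq, hidden_size=4, seed=0):
    """Run two LSTM streams that swap previous cell states at every step.

    ASSUMPTION: the swap rule below is a guess at "state exchanging";
    the catalog record does not specify the actual mechanism.
    """
    rng = random.Random(seed)
    p_app = make_params(len(appearance_seq[0]), hidden_size, rng)
    p_mot = make_params(len(motion_seq[0]), hidden_size, rng)
    h_app, c_app = [0.0] * hidden_size, [0.0] * hidden_size
    h_mot, c_mot = [0.0] * hidden_size, [0.0] * hidden_size
    for x_app, x_mot in zip(appearance_seq, motion_seq):
        # Each stream keeps its own hidden state but reads the OTHER
        # stream's previous cell state.
        h_app, c_app_next = lstm_step(p_app, x_app, h_app, c_mot)
        h_mot, c_mot_next = lstm_step(p_mot, x_mot, h_mot, c_app)
        c_app, c_mot = c_app_next, c_mot_next
    # Concatenate the final hidden states as a joint video descriptor.
    return h_app + h_mot


if __name__ == "__main__":
    appearance = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]       # toy frame features
    motion = [[0.0, 0.1, 0.2], [0.1, 0.0, 0.3], [0.2, 0.2, 0.1]]  # toy flow features
    print(run_state_exchange(appearance, motion))
```

In a full model the concatenated descriptor would typically feed a classifier layer; that step, like the training procedure, is outside what this record describes.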