Modelling a Spatial-Motion Deep Learning Framework to Classify Dynamic Patterns of Videos

Video classification is an essential process for analyzing the pervasive semantic information of video content in computer vision. Traditional hand-crafted features are insufficient when classifying complex video information due to the similarity of visual contents with different illumination condit...

Bibliographic Details
Main Authors:	Sandeli Priyanwada Kasthuri Arachchi, Timothy K. Shih, Noorkholis Luthfil Hakim
Format:	Article
Language:	English
Published:	MDPI AG 2020-02-01
Series:	Applied Sciences
Subjects:	dynamic pattern classification, deep learning, spatiotemporal data, convolution neural network, recurrent neural network
Online Access:	https://www.mdpi.com/2076-3417/10/4/1479

_version_	1818327621361991680
author	Sandeli Priyanwada Kasthuri Arachchi, Timothy K. Shih, Noorkholis Luthfil Hakim
collection	DOAJ
description	Video classification is an essential process for analyzing the pervasive semantic information of video content in computer vision. Traditional hand-crafted features are insufficient for classifying complex video information because visual content can look similar under different illumination conditions. Prior studies of video classification focused on the relationship between the standalone streams themselves. In this paper, by leveraging deep learning methodologies, we propose a two-stream neural network concept named state-exchanging long short-term memory (SE-LSTM). With its spatial-motion state-exchanging model, the SE-LSTM can classify dynamic patterns of videos using appearance and motion features. The SE-LSTM extends the general-purpose LSTM by exchanging information with the previous cell states of both the appearance and motion streams. We propose a novel two-stream model, Dual-CNNSELSTM, which combines the SE-LSTM concept with a Convolutional Neural Network, and use various video datasets to validate the proposed architecture. The experimental results demonstrate that the proposed two-stream Dual-CNNSELSTM architecture performs strongly, achieving accuracies of 81.62%, 79.87%, and 69.86% on the hand gestures, fireworks displays, and HMDB51 datasets, respectively. Furthermore, the overall results indicate that the proposed model is best suited to classifying dynamic patterns against static backgrounds.
first_indexed	2024-12-13T12:19:11Z
format	Article
id	doaj.art-f66ecea812924c5eba44893eafe49d81
institution	Directory Open Access Journal
issn	2076-3417
language	English
last_indexed	2024-12-13T12:19:11Z
publishDate	2020-02-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
doi	10.3390/app10041479
author_affiliation	Department of Computer Science and Information Engineering, National Central University, Taoyuan 32001, Taiwan (all three authors)
title	Modelling a Spatial-Motion Deep Learning Framework to Classify Dynamic Patterns of Videos
topic	dynamic pattern classification, deep learning, spatiotemporal data, convolution neural network, recurrent neural network
url	https://www.mdpi.com/2076-3417/10/4/1479
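
The description field above says that the SE-LSTM exchanges information between the previous cell states of the appearance and motion streams. The following is a minimal NumPy sketch of that idea only, not the authors' implementation: the record does not state how the exchange is computed, so the convex mixing of the two previous cell states, the `mix` parameter, and all function names here are assumptions made for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(input_dim, hidden_dim, rng):
    # One stacked weight matrix for the four LSTM gates, applied to [x, h_prev].
    W = rng.standard_normal((4 * hidden_dim, input_dim + hidden_dim)) * 0.1
    b = np.zeros(4 * hidden_dim)
    return W, b

def lstm_step(x, h_prev, c_prev, W, b):
    # Standard LSTM cell update for a single time step.
    z = W @ np.concatenate([x, h_prev]) + b
    i, f, o, g = np.split(z, 4)
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c

def se_lstm_forward(app_seq, mot_seq, app_params, mot_params, hidden_dim, mix=0.5):
    # Two LSTM streams (appearance, motion) run in lockstep over the frames.
    h_a = c_a = h_m = c_m = np.zeros(hidden_dim)
    for x_a, x_m in zip(app_seq, mot_seq):
        # ASSUMPTION: "state exchange" is modelled as a convex mix of both
        # streams' previous cell states; mix=0.0 gives two independent LSTMs.
        c_a_in = (1.0 - mix) * c_a + mix * c_m
        c_m_in = (1.0 - mix) * c_m + mix * c_a
        h_a, c_a = lstm_step(x_a, h_a, c_a_in, *app_params)
        h_m, c_m = lstm_step(x_m, h_m, c_m_in, *mot_params)
    # Concatenated final hidden states would feed a classifier layer.
    return np.concatenate([h_a, h_m])
```

In a setup like the one the description outlines, `app_seq` and `mot_seq` would be per-frame feature vectors produced by a CNN from the appearance and motion inputs; here they are simply arrays of shape (frames, features).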
