Temporal Spiking Recurrent Neural Network for Action Recognition

In this paper, we propose a novel temporal spiking recurrent neural network (TSRNN) to perform robust action recognition in videos. The proposed TSRNN employs a novel spiking architecture which utilizes the local discriminative features from high-confidence reliable frames as spiking signals. The co...


Bibliographic Details
Main Authors: Wei Wang, Siyuan Hao, Yunchao Wei, Shengtao Xiao, Jiashi Feng, Nicu Sebe
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects: Action recognition, temporal spiking, recurrent neural network
Online Access: https://ieeexplore.ieee.org/document/8808849/
collection DOAJ
description In this paper, we propose a novel temporal spiking recurrent neural network (TSRNN) to perform robust action recognition in videos. The proposed TSRNN employs a novel spiking architecture which uses the local discriminative features from high-confidence, reliable frames as spiking signals. The conventional CNN-RNNs typically used for this problem treat all frames as equally important, which makes them error-prone when some frames are noisy. The TSRNN addresses this problem with a temporal pooling architecture that helps the RNN select sparse and reliable frames and enhances its capability to model long-range temporal information. In addition, a message-passing bridge is added between the spiking signals and the recurrent unit. In this way, the spiking signals can guide the RNN to correct its long-term memory, accumulated across multiple frames, when it has been contaminated by noisy frames with distracting factors (e.g., occlusion, rapid scene transition). With these two novel components, TSRNN achieves competitive performance compared with state-of-the-art CNN-RNN architectures on two large-scale public benchmarks, UCF101 and HMDB51.
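The frame-selection idea in the description can be illustrated with a small sketch. This is not the TSRNN architecture from the paper: the record does not say how frame confidence is computed or how the recurrent unit is defined, so the sketch assumes that confidence is the largest softmax probability of a frame and replaces the RNN with a toy running average. All function names, the `decay` parameter, and the example scores are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_reliable_frames(frame_scores, k):
    """Return the indices of the k most confident frames, in temporal order.

    Assumption of this sketch: confidence is the largest softmax
    probability of a frame.
    """
    confidence = [max(softmax(s)) for s in frame_scores]
    ranked = sorted(range(len(frame_scores)),
                    key=lambda i: confidence[i], reverse=True)
    return sorted(ranked[:k])


def recurrent_summary(frame_scores, selected, decay=0.5):
    """Toy recurrent update: the state is revised only at selected frames."""
    state = [0.0] * len(frame_scores[0])
    for t in selected:
        probs = softmax(frame_scores[t])
        state = [decay * h + (1.0 - decay) * p for h, p in zip(state, probs)]
    return state


# Hypothetical per-frame class scores for a 5-frame clip with 3 classes.
# Frames 1 and 3 are nearly uniform, standing in for occluded or noisy frames.
clip = [
    [4.0, 0.5, 0.2],
    [1.0, 1.1, 0.9],
    [3.5, 0.3, 0.1],
    [0.8, 1.0, 1.2],
    [4.2, 0.4, 0.3],
]
selected = select_reliable_frames(clip, k=3)
summary = recurrent_summary(clip, selected)
predicted_class = max(range(len(summary)), key=lambda c: summary[c])
```

In this toy clip the two near-uniform frames are skipped, so they never touch the running state. A model that weights every frame equally would mix their ambiguous scores into its memory, which is the failure mode the description attributes to conventional CNN-RNNs.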
institution Directory Open Access Journal
issn 2169-3536
Record ID: doaj.art-0d66ed7e61874ce2bc9c0b2cdfef9154 (2022-12-21T20:18:25Z)
Journal: IEEE Access (IEEE), ISSN 2169-3536
Volume: 7
Pages: 117165-117175
DOI: 10.1109/ACCESS.2019.2936604
IEEE Xplore document: 8808849
Title: Temporal Spiking Recurrent Neural Network for Action Recognition
Authors and affiliations:
Wei Wang: Computer Vision Laboratory, École polytechnique fédérale de Lausanne (EPFL), Lausanne, Switzerland
Siyuan Hao (https://orcid.org/0000-0002-5477-1017): Information and Control Engineering College, Qingdao University of Technology, Qingdao, China
Yunchao Wei: Beckman Institute, University of Illinois at Urbana–Champaign, Urbana, IL, USA
Shengtao Xiao: Department of Electrical and Computer Engineering, National University of Singapore, Singapore
Jiashi Feng: Department of Electrical and Computer Engineering, National University of Singapore, Singapore
Nicu Sebe: Department of Information Engineering and Computer Science, University of Trento, Trento, Italy
Keywords: Action recognition; temporal spiking; recurrent neural network