Real-Time Video Saliency Prediction Via 3D Residual Convolutional Neural Network

Attention is a fundamental attribute of the human visual system and plays an important role in many visual perception tasks. The key issue in video saliency prediction is how to exploit temporal information efficiently. Instead of singling out the temporal saliency maps, we propose a real-time end-to-end v...

Bibliographic Details
Main Authors: Zhenhao Sun, Xu Wang, Qiudan Zhang, Jianmin Jiang
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Subjects: Video saliency prediction; eye fixation dataset; 3D residual convolutional neural network
Online Access: https://ieeexplore.ieee.org/document/8863376/
author Zhenhao Sun
Xu Wang
Qiudan Zhang
Jianmin Jiang
collection DOAJ
description Attention is a fundamental attribute of the human visual system and plays an important role in many visual perception tasks. The key issue in video saliency prediction is how to exploit temporal information efficiently. Instead of singling out the temporal saliency maps, we propose a real-time end-to-end video saliency prediction model based on a 3D residual convolutional neural network (3D-ResNet), which incorporates the prediction of spatial and temporal saliency maps into a single process. In particular, a multi-scale feature representation scheme is employed to further boost model performance. In addition, a frame-skipping strategy is proposed to speed up saliency map inference. Moreover, a new, challenging eye-tracking database with 220 video clips is established to facilitate research on video saliency prediction. Extensive experimental results show that our model outperforms state-of-the-art methods on the eye fixation datasets in terms of both prediction accuracy and inference speed.
first_indexed 2024-12-19T22:46:44Z
format Article
id doaj.art-0bb9f39c5e424c80a9e1cbcdbaf99b77
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-12-19T22:46:44Z
publishDate 2019-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-0bb9f39c5e424c80a9e1cbcdbaf99b77 (2022-12-21T20:02:56Z)
Real-Time Video Saliency Prediction Via 3D Residual Convolutional Neural Network
IEEE Access (IEEE, ISSN 2169-3536), 2019-01-01, vol. 7, pp. 147743-147754, doi:10.1109/ACCESS.2019.2946479, article no. 8863376
Zhenhao Sun, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
Xu Wang (https://orcid.org/0000-0002-2948-6468), College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
Qiudan Zhang, Department of Computer Science, City University of Hong Kong, Hong Kong
Jianmin Jiang (https://orcid.org/0000-0002-7576-3999), College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China
topic Video saliency prediction
eye fixation dataset
3D residual convolutional neural network
url https://ieeexplore.ieee.org/document/8863376/