Real-Time Video Saliency Prediction Via 3D Residual Convolutional Neural Network

Attention is a fundamental attribute of the human visual system that plays important roles in many visual perception tasks. The key issue in video saliency prediction is how to efficiently exploit temporal information. Instead of singling out the temporal saliency maps, we propose a real-time end-to-end v...

Bibliographic Details
Main Authors:	Zhenhao Sun, Xu Wang, Qiudan Zhang, Jianmin Jiang
Format:	Article
Language:	English
Published:	IEEE 2019-01-01
Series:	IEEE Access
Subjects:	Video saliency prediction, eye fixation dataset, 3D residual convolutional neural network
Online Access:	https://ieeexplore.ieee.org/document/8863376/

_version_	1818910684903112704
author	Zhenhao Sun, Xu Wang, Qiudan Zhang, Jianmin Jiang
author_facet	Zhenhao Sun, Xu Wang, Qiudan Zhang, Jianmin Jiang
author_sort	Zhenhao Sun
collection	DOAJ
description	Attention is a fundamental attribute of the human visual system that plays important roles in many visual perception tasks. The key issue in video saliency prediction is how to efficiently exploit temporal information. Instead of singling out the temporal saliency maps, we propose a real-time end-to-end video saliency prediction model via a 3D residual convolutional neural network (3D-ResNet), which incorporates the prediction of spatial and temporal saliency maps into a single process. In particular, a multi-scale feature representation scheme is employed to further boost model performance. In addition, a frame skipping strategy is proposed to speed up saliency map inference. Moreover, a new challenging eye tracking database with 220 video clips is established to facilitate research on video saliency prediction. Extensive experimental results show that our model outperforms state-of-the-art methods on eye fixation datasets in terms of both prediction accuracy and inference speed.
first_indexed	2024-12-19T22:46:44Z
format	Article
id	doaj.art-0bb9f39c5e424c80a9e1cbcdbaf99b77
institution	Directory Open Access Journal
issn	2169-3536
language	English
last_indexed	2024-12-19T22:46:44Z
publishDate	2019-01-01
publisher	IEEE
record_format	Article
series	IEEE Access
spelling	doaj.art-0bb9f39c5e424c80a9e1cbcdbaf99b77; 2022-12-21T20:02:56Z; eng; IEEE; IEEE Access; 2169-3536; 2019-01-01; 7; 147743; 147754; 10.1109/ACCESS.2019.2946479; 8863376; Real-Time Video Saliency Prediction Via 3D Residual Convolutional Neural Network; Zhenhao Sun (0); Xu Wang (1) https://orcid.org/0000-0002-2948-6468; Qiudan Zhang (2); Jianmin Jiang (3) https://orcid.org/0000-0002-7576-3999; College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China; College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China; Department of Computer Science, City University of Hong Kong, Hong Kong; College of Computer Science and Software Engineering, Shenzhen University, Shenzhen, China; Attention is a fundamental attribute of the human visual system that plays important roles in many visual perception tasks. The key issue in video saliency prediction is how to efficiently exploit temporal information. Instead of singling out the temporal saliency maps, we propose a real-time end-to-end video saliency prediction model via a 3D residual convolutional neural network (3D-ResNet), which incorporates the prediction of spatial and temporal saliency maps into a single process. In particular, a multi-scale feature representation scheme is employed to further boost model performance. In addition, a frame skipping strategy is proposed to speed up saliency map inference. Moreover, a new challenging eye tracking database with 220 video clips is established to facilitate research on video saliency prediction. Extensive experimental results show that our model outperforms state-of-the-art methods on eye fixation datasets in terms of both prediction accuracy and inference speed.; https://ieeexplore.ieee.org/document/8863376/; Video saliency prediction, eye fixation dataset, 3D residual convolutional neural network
spellingShingle	Zhenhao Sun, Xu Wang, Qiudan Zhang, Jianmin Jiang; Real-Time Video Saliency Prediction Via 3D Residual Convolutional Neural Network; IEEE Access; Video saliency prediction, eye fixation dataset, 3D residual convolutional neural network
title	Real-Time Video Saliency Prediction Via 3D Residual Convolutional Neural Network
title_sort	real time video saliency prediction via 3d residual convolutional neural network
topic	Video saliency prediction, eye fixation dataset, 3D residual convolutional neural network
url	https://ieeexplore.ieee.org/document/8863376/
work_keys_str_mv	AT zhenhaosun realtimevideosaliencypredictionvia3dresidualconvolutionalneuralnetwork AT xuwang realtimevideosaliencypredictionvia3dresidualconvolutionalneuralnetwork AT qiudanzhang realtimevideosaliencypredictionvia3dresidualconvolutionalneuralnetwork AT jianminjiang realtimevideosaliencypredictionvia3dresidualconvolutionalneuralnetwork

Similar Items