Spatiotemporal saliency detection via sparse representation

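Multimedia applications such as retrieval and copy detection can benefit from saliency detection, which identifies the areas in images and videos that capture the attention of the human visual system. In this paper, we propose a new spatiotemporal saliency framework for videos based on sparse representation. For temporal saliency, we model the movement of the target patch as a reconstruction process in which overlapping patches in neighboring frames are used to reconstruct the target patch. The learned coefficients encode the positions of the matched patches and can therefore represent the motion trajectory of the target patch. We also introduce a smoothing term into our sparse coding framework to learn coherent motion trajectories. Based on the psychological finding that an abrupt stimulus can cause a rapid and involuntary deployment of attention, our temporal model combines the reconstruction error, the sparsity regularizer, and local trajectory contrast to measure motion saliency. For spatial saliency, a similar sparse reconstruction process is adopted to capture regions with high center-surround contrast. Finally, temporal and spatial saliency are combined by agreement to favor salient regions with high confidence. Experimental results on a human fixation video dataset show that our method outperforms five state-of-the-art approaches.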


Bibliographic Details
Main Authors:	Ren, Zhixiang; Gao, Shenghua; Rajan, Deepu; Chia, Clement Liang-Tien; Huang, Yun
Other Authors:	School of Computer Engineering
Format:	Conference Paper
Language:	English
Published:	2013
Subjects:	DRNTU::Engineering::Computer science and engineering
Online Access:	https://hdl.handle.net/10356/99619 http://hdl.handle.net/10220/13023

_version_	1824455914291200000
author	Ren, Zhixiang; Gao, Shenghua; Rajan, Deepu; Chia, Clement Liang-Tien; Huang, Yun
author2	School of Computer Engineering
author_facet	School of Computer Engineering; Ren, Zhixiang; Gao, Shenghua; Rajan, Deepu; Chia, Clement Liang-Tien; Huang, Yun
author_sort	Ren, Zhixiang
collection	NTU
description	Multimedia applications such as retrieval and copy detection can benefit from saliency detection, which identifies the areas in images and videos that capture the attention of the human visual system. In this paper, we propose a new spatiotemporal saliency framework for videos based on sparse representation. For temporal saliency, we model the movement of the target patch as a reconstruction process in which overlapping patches in neighboring frames are used to reconstruct the target patch. The learned coefficients encode the positions of the matched patches and can therefore represent the motion trajectory of the target patch. We also introduce a smoothing term into our sparse coding framework to learn coherent motion trajectories. Based on the psychological finding that an abrupt stimulus can cause a rapid and involuntary deployment of attention, our temporal model combines the reconstruction error, the sparsity regularizer, and local trajectory contrast to measure motion saliency. For spatial saliency, a similar sparse reconstruction process is adopted to capture regions with high center-surround contrast. Finally, temporal and spatial saliency are combined by agreement to favor salient regions with high confidence. Experimental results on a human fixation video dataset show that our method outperforms five state-of-the-art approaches.
first_indexed	2025-02-19T03:45:46Z
format	Conference Paper
id	ntu-10356/99619
institution	Nanyang Technological University
language	English
last_indexed	2025-02-19T03:45:46Z
publishDate	2013
record_format	dspace
spelling	ntu-10356/99619 2020-05-28T07:19:13Z Spatiotemporal saliency detection via sparse representation Ren, Zhixiang; Gao, Shenghua; Rajan, Deepu; Chia, Clement Liang-Tien; Huang, Yun School of Computer Engineering IEEE International Conference on Multimedia and Expo (2012 : Melbourne, Australia) DRNTU::Engineering::Computer science and engineering Multimedia applications like retrieval, copy detection etc. can gain from saliency detection, which is essentially a method to identify areas in images and videos that capture the attention of the human visual system. In this paper, we propose a new spatiotemporal saliency framework for videos based on sparse representation. For temporal saliency, we model the movement of the target patch as a reconstruction process, and the overlapping patches in neighboring frames are used to reconstruct the target patch. The learned coefficients encode the positions of the matched patches, which are able to represent the motion trajectory of the target patch. We also introduce a smoothing term into our sparse coding framework to learn coherent motion trajectories. Based on the psychological findings that abrupt stimulus could cause a rapid and involuntary deployment of attention, our temporal model combines the reconstruction error, sparsity regularizer, and local trajectory contrast to measure the motion saliency. For spatial saliency, a similar sparse reconstruction process is adopted to capture the regions with high center-surround contrast. Finally, the temporal saliency and spatial saliency are combined by agreement to favor the salient regions with high confidence. Experimental results on a human fixation video dataset show our method achieved the best performance over five state-of-the-art approaches. 2013-08-06T02:52:28Z 2019-12-06T20:09:34Z 2013-08-06T02:52:28Z 2019-12-06T20:09:34Z 2012 2012 Conference Paper https://hdl.handle.net/10356/99619 http://hdl.handle.net/10220/13023 10.1109/ICME.2012.173 en
title	Spatiotemporal saliency detection via sparse representation
topic	DRNTU::Engineering::Computer science and engineering
url	https://hdl.handle.net/10356/99619 http://hdl.handle.net/10220/13023


Similar Items