Quality-Driven Dual-Branch Feature Integration Network for Video Salient Object Detection

Video salient object detection has attracted growing interest in recent years. However, some existing video saliency models often suffer from the inappropriate utilization of spatial and temporal cues and the insufficient aggregation of different level features, leading to remarkable performance deg...

Bibliographic Details
Main Authors:	Xiaofei Zhou, Hanxiao Gao, Longxuan Yu, Defu Yang, Jiyong Zhang
Format:	Article
Language:	English
Published:	MDPI AG 2023-01-01
Series:	Electronics
Subjects:	video salient object detection quality score feature fusion dual-branch
Online Access:	https://www.mdpi.com/2079-9292/12/3/680

author	Xiaofei Zhou Hanxiao Gao Longxuan Yu Defu Yang Jiyong Zhang
collection	DOAJ
description	Video salient object detection has attracted growing interest in recent years. However, some existing video saliency models use spatial and temporal cues inappropriately and aggregate features from different levels insufficiently, which leads to marked performance degradation. Therefore, we propose a quality-driven dual-branch feature integration network that focuses on the adaptive fusion of multi-modal cues and the sufficient aggregation of multi-level spatiotemporal features. First, we employ the quality-driven multi-modal feature fusion (QMFF) module to combine the spatial and temporal features. In particular, the quality scores estimated from each level’s spatial and temporal cues are used not only to weight the two modal features but also to adaptively integrate the coarse spatial and temporal saliency predictions into a guidance map, which further enhances the two modal features. Second, we deploy the dual-branch-based multi-level feature aggregation (DMFA) module to integrate multi-level spatiotemporal features, where two branches, the progressive decoder branch and the direct concatenation branch, explore the cooperation of multi-level spatiotemporal features. To fuse the outputs of the two branches adaptively, we design the dual-branch fusion (DF) unit, in which the channel weight of each output is learned jointly from the two outputs. Experiments conducted on four video datasets demonstrate the effectiveness and superiority of our model over state-of-the-art video saliency models.
first_indexed	2024-03-11T09:47:29Z
format	Article
id	doaj.art-6ab9c233cbb447b18832fe089c48fab9
institution	Directory Open Access Journal
issn	2079-9292
language	English
last_indexed	2024-03-11T09:47:29Z
publishDate	2023-01-01
publisher	MDPI AG
record_format	Article
series	Electronics
spelling	doaj.art-6ab9c233cbb447b18832fe089c48fab9; 2023-11-16T16:29:56Z; eng; MDPI AG; Electronics; 2079-9292; 2023-01-01; vol. 12, no. 3, art. 680; 10.3390/electronics12030680; Quality-Driven Dual-Branch Feature Integration Network for Video Salient Object Detection; Xiaofei Zhou (School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China); Hanxiao Gao (School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China); Longxuan Yu (School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China); Defu Yang (School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China); Jiyong Zhang (School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China); https://www.mdpi.com/2079-9292/12/3/680; video salient object detection; quality score; feature fusion; dual-branch
title	Quality-Driven Dual-Branch Feature Integration Network for Video Salient Object Detection
topic	video salient object detection quality score feature fusion dual-branch
url	https://www.mdpi.com/2079-9292/12/3/680

Similar Items