Quality-Driven Dual-Branch Feature Integration Network for Video Salient Object Detection

Video salient object detection has attracted growing interest in recent years. However, some existing video saliency models often suffer from the inappropriate utilization of spatial and temporal cues and the insufficient aggregation of different level features, leading to remarkable performance deg...

Bibliographic Details
Main Authors: Xiaofei Zhou, Hanxiao Gao, Longxuan Yu, Defu Yang, Jiyong Zhang
Format: Article
Language: English
Published: MDPI AG 2023-01-01
Series: Electronics
Subjects: video salient object detection; quality score; feature fusion; dual-branch
Online Access: https://www.mdpi.com/2079-9292/12/3/680
_version_ 1797624749736591360
author Xiaofei Zhou
Hanxiao Gao
Longxuan Yu
Defu Yang
Jiyong Zhang
collection DOAJ
description Video salient object detection has attracted growing interest in recent years. However, some existing video saliency models use spatial and temporal cues inappropriately and aggregate features from different levels insufficiently, which noticeably degrades performance. We therefore propose a quality-driven dual-branch feature integration network that focuses on the adaptive fusion of multi-modal cues and the sufficient aggregation of multi-level spatiotemporal features. First, a quality-driven multi-modal feature fusion (QMFF) module combines the spatial and temporal features. Specifically, the quality scores estimated from each level's spatial and temporal cues are used not only to weight the two modal features but also to adaptively integrate the coarse spatial and temporal saliency predictions into a guidance map, which further enhances the two modal features. Second, a dual-branch-based multi-level feature aggregation (DMFA) module integrates the multi-level spatiotemporal features; its two branches, a progressive decoder branch and a direct concatenation branch, jointly exploit the cooperation among these features. To fuse the outputs of the two branches adaptively, we design a dual-branch fusion (DF) unit in which the channel weights of each output are learned jointly from both outputs. Experiments on four video datasets demonstrate the effectiveness of our model and its superiority over state-of-the-art video saliency models.
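The quality-driven weighting described in the abstract can be illustrated with a toy sketch. This is not the authors' QMFF module: the function names, the softmax normalization of the two scores, the residual-style gating, and the additive fusion are all assumptions made for illustration, and the quality scores are passed in directly rather than estimated by a network.

```python
import math


def softmax2(a, b):
    """Normalize two raw quality scores into weights that sum to 1."""
    m = max(a, b)
    ea, eb = math.exp(a - m), math.exp(b - m)
    return ea / (ea + eb), eb / (ea + eb)


def quality_weighted_fusion(spatial_feat, temporal_feat,
                            spatial_pred, temporal_pred,
                            q_spatial, q_temporal):
    """Hypothetical, simplified quality-driven fusion at a single level.

    Features and coarse predictions are flat lists of floats of equal
    length. q_spatial and q_temporal are scalar quality scores, assumed
    to be given here (the abstract says they are estimated from the cues).
    """
    w_s, w_t = softmax2(q_spatial, q_temporal)

    # 1) Weight the two modal features by their quality.
    weighted_s = [w_s * x for x in spatial_feat]
    weighted_t = [w_t * x for x in temporal_feat]

    # 2) Blend the coarse saliency predictions into a guidance map.
    guidance = [w_s * ps + w_t * pt
                for ps, pt in zip(spatial_pred, temporal_pred)]

    # 3) Enhance each weighted feature with the guidance map
    #    (residual-style gating is an assumption of this sketch).
    enhanced_s = [x * (1.0 + g) for x, g in zip(weighted_s, guidance)]
    enhanced_t = [x * (1.0 + g) for x, g in zip(weighted_t, guidance)]

    # 4) Fuse the two enhanced features by element-wise addition.
    fused = [a + b for a, b in zip(enhanced_s, enhanced_t)]
    return fused, guidance, (w_s, w_t)
```

With equal quality scores the two modalities contribute equally; raising one score shifts both the feature weighting and the guidance map toward that modality.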
first_indexed 2024-03-11T09:47:29Z
format Article
id doaj.art-6ab9c233cbb447b18832fe089c48fab9
institution Directory Open Access Journal
issn 2079-9292
language English
last_indexed 2024-03-11T09:47:29Z
publishDate 2023-01-01
publisher MDPI AG
record_format Article
series Electronics
spelling doaj.art-6ab9c233cbb447b18832fe089c48fab9; 2023-11-16T16:29:56Z; eng; MDPI AG; Electronics; 2079-9292; 2023-01-01; volume 12, issue 3, article 680; doi 10.3390/electronics12030680
Quality-Driven Dual-Branch Feature Integration Network for Video Salient Object Detection
Xiaofei Zhou (School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China)
Hanxiao Gao (School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China)
Longxuan Yu (School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China)
Defu Yang (School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China)
Jiyong Zhang (School of Automation, Hangzhou Dianzi University, Hangzhou 310018, China)
https://www.mdpi.com/2079-9292/12/3/680
video salient object detection; quality score; feature fusion; dual-branch
title Quality-Driven Dual-Branch Feature Integration Network for Video Salient Object Detection
topic video salient object detection
quality score
feature fusion
dual-branch
url https://www.mdpi.com/2079-9292/12/3/680