Coordinate Attention Filtering Depth-Feature Guide Cross-Modal Fusion RGB-Depth Salient Object Detection


Bibliographic Details
Main Authors: Lingbing Meng, Mengya Yuan, Xuehan Shi, Qingqing Liu, Le Zhange, Jinhua Wu, Ping Dai, Fei Cheng
Format: Article
Language: English
Published: Hindawi Limited 2023-01-01
Series: Advances in Multimedia
Online Access: http://dx.doi.org/10.1155/2023/9921988
Collection: DOAJ
Description: Existing RGB + depth (RGB-D) salient object detection methods mainly focus on better integrating the cross-modal features of RGB images and depth maps. Many methods use the same feature interaction module to fuse RGB and depth maps, which ignores the inherent properties of the different modalities. In contrast to previous methods, this paper proposes a novel RGB-D salient object detection method that uses a depth-feature guide cross-modal fusion module based on the properties of RGB images and depth maps. First, a depth-feature guide cross-modal fusion module is designed using coordinate attention to effectively exploit the simple data representation capability of depth maps. Second, a dense decoder guidance module is proposed to recover the spatial details of salient objects. Furthermore, a context-aware content module is proposed to extract rich context information, which enables more complete prediction of multiple objects. Experimental results on six public benchmark datasets demonstrate that, compared with 15 mainstream convolutional neural network detection methods, the saliency map edge contours detected by the proposed model have better continuity and clearer spatial structure details. Excellent results are achieved on four quantitative evaluation metrics. Furthermore, the effectiveness of the three proposed modules is verified through ablation experiments.
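The abstract names coordinate attention as the mechanism behind the depth-feature guide fusion module but gives no implementation details. As a rough illustration only, the core idea of coordinate attention can be sketched as follows in NumPy; this simplified version replaces the full module's shared convolution, normalization, and nonlinearity with a direct sigmoid gate, and is not the authors' implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def coordinate_attention(feat):
    """Minimal coordinate-attention sketch for a feature map of shape (C, H, W).

    Unlike global average pooling, which collapses both spatial axes,
    coordinate attention pools along each axis separately, so the resulting
    attention weights retain positional information along H and along W.
    """
    C, H, W = feat.shape
    # Direction-aware pooling: averaging over W yields an H-profile per
    # channel; averaging over H yields a W-profile per channel.
    pool_h = feat.mean(axis=2)              # (C, H)
    pool_w = feat.mean(axis=1)              # (C, W)
    # The full module transforms these profiles with shared 1x1 convolutions
    # before gating; here we gate directly with a sigmoid for brevity.
    attn_h = sigmoid(pool_h)[:, :, None]    # (C, H, 1)
    attn_w = sigmoid(pool_w)[:, None, :]    # (C, 1, W)
    # Reweight the input with both direction-aware attention maps.
    return feat * attn_h * attn_w

# Example: reweight a random stand-in for a depth feature map.
depth_feat = np.random.default_rng(0).standard_normal((8, 16, 16))
out = coordinate_attention(depth_feat)
```

Because each sigmoid gate lies in (0, 1), the output is an elementwise-attenuated copy of the input with the same shape, which is what lets such a module slot into a fusion pipeline without changing tensor dimensions.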
First indexed: 2024-04-09T12:34:45Z
Record ID: doaj.art-d9b0eaa7ab3148c3bf6bf559d93e1b16
Institution: Directory Open Access Journal
ISSN: 1687-5699
Last indexed: 2025-02-18T10:43:14Z
Author affiliation: School of Anhui Institute of Information Technology (all authors)