Coordinate Attention Filtering Depth-Feature Guide Cross-Modal Fusion RGB-Depth Salient Object Detection
Existing RGB + depth (RGB-D) salient object detection methods mainly focus on better integrating the cross-modal features of RGB images and depth maps. Many methods use the same feature interaction module to fuse RGB and depth maps, ignoring the inherent properties of the different modalities. In contrast to previous methods, this paper proposes a novel RGB-D salient object detection method that uses a depth-feature guide cross-modal fusion module based on the properties of RGB images and depth maps. First, a depth-feature guide cross-modal fusion module is designed using coordinate attention to exploit the simple data representation of depth maps effectively. Second, a dense decoder guidance module is proposed to recover the spatial details of salient objects. Third, a context-aware content module is proposed to extract rich context information, enabling more complete prediction of multiple objects. Experimental results on six public benchmark datasets demonstrate that, compared with 15 mainstream convolutional neural network detection methods, the saliency-map edge contours detected by the proposed model have better continuity and clearer spatial structure details. Excellent results are achieved on four quantitative evaluation metrics, and the effectiveness of the three proposed modules is verified through ablation experiments.
Main Authors: | Lingbing Meng, Mengya Yuan, Xuehan Shi, Qingqing Liu, Le Zhange, Jinhua Wu, Ping Dai, Fei Cheng |
Format: | Article |
Language: | English |
Published: | Hindawi Limited, 2023-01-01 |
Series: | Advances in Multimedia |
Online Access: | http://dx.doi.org/10.1155/2023/9921988 |
author | Lingbing Meng Mengya Yuan Xuehan Shi Qingqing Liu Le Zhange Jinhua Wu Ping Dai Fei Cheng |
collection | DOAJ |
description | Existing RGB + depth (RGB-D) salient object detection methods mainly focus on better integrating the cross-modal features of RGB images and depth maps. Many methods use the same feature interaction module to fuse RGB and depth maps, ignoring the inherent properties of the different modalities. In contrast to previous methods, this paper proposes a novel RGB-D salient object detection method that uses a depth-feature guide cross-modal fusion module based on the properties of RGB images and depth maps. First, a depth-feature guide cross-modal fusion module is designed using coordinate attention to exploit the simple data representation of depth maps effectively. Second, a dense decoder guidance module is proposed to recover the spatial details of salient objects. Third, a context-aware content module is proposed to extract rich context information, enabling more complete prediction of multiple objects. Experimental results on six public benchmark datasets demonstrate that, compared with 15 mainstream convolutional neural network detection methods, the saliency-map edge contours detected by the proposed model have better continuity and clearer spatial structure details. Excellent results are achieved on four quantitative evaluation metrics, and the effectiveness of the three proposed modules is verified through ablation experiments. |
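The abstract's core idea — coordinate attention computed from depth features and used to reweight RGB features before fusion — can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the channel-mixing weights are random stand-ins for learned 1×1 convolutions, and the function names (`coordinate_attention`, `depth_guided_fusion`) are hypothetical. It only shows the mechanism: pool the depth features along height and width separately, mix channels, and produce per-row and per-column sigmoid gates that modulate the RGB features.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def coordinate_attention(feat, reduction=4, rng=None):
    """Toy coordinate attention over a (C, H, W) feature map:
    direction-aware pooling along H and W, a shared channel-mixing
    step (random stand-in for a learned 1x1 conv), then separate
    row and column sigmoid gates combined into a (C, H, W) map."""
    C, H, W = feat.shape
    rng = rng or np.random.default_rng(0)
    pool_h = feat.mean(axis=2)   # average over width  -> (C, H)
    pool_w = feat.mean(axis=1)   # average over height -> (C, W)
    Cm = max(C // reduction, 1)
    W1 = rng.standard_normal((Cm, C)) * 0.1   # stand-in squeeze weights
    W2 = rng.standard_normal((C, Cm)) * 0.1   # stand-in excite weights
    # shared transform on the concatenated directional descriptors
    mixed = np.maximum(W1 @ np.concatenate([pool_h, pool_w], axis=1), 0)
    gate_h = sigmoid(W2 @ mixed[:, :H])       # per-row gates    (C, H)
    gate_w = sigmoid(W2 @ mixed[:, H:])       # per-column gates (C, W)
    return gate_h[:, :, None] * gate_w[:, None, :]  # (C, H, W)

def depth_guided_fusion(rgb_feat, depth_feat):
    """Depth features produce the attention map that filters the RGB
    features; a skip connection keeps the original RGB signal."""
    att = coordinate_attention(depth_feat)
    return rgb_feat * att + rgb_feat

rgb = np.random.default_rng(1).standard_normal((8, 6, 6))
dep = np.random.default_rng(2).standard_normal((8, 6, 6))
fused = depth_guided_fusion(rgb, dep)
print(fused.shape)  # (8, 6, 6)
```

Because the gates are sigmoids, every attention value lies strictly in (0, 1), so the depth branch can only attenuate or pass RGB responses, never amplify them unboundedly; the residual term preserves the unfiltered RGB features, matching the "guide" rather than "replace" role the abstract describes for depth.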
format | Article |
id | doaj.art-d9b0eaa7ab3148c3bf6bf559d93e1b16 |
institution | Directory Open Access Journal |
issn | 1687-5699 |
language | English |
publishDate | 2023-01-01 |
publisher | Hindawi Limited |
series | Advances in Multimedia |
affiliations | School of Anhui Institute of Information Technology (all eight authors) |
title | Coordinate Attention Filtering Depth-Feature Guide Cross-Modal Fusion RGB-Depth Salient Object Detection |
url | http://dx.doi.org/10.1155/2023/9921988 |