Collection-CAM: A Faster Region-Based Saliency Method Using Collection-Wise Mask Over Pyramidal Features

Due to the black-box nature of deep networks, explaining their decision-making is extremely challenging. One solution is to apply post-hoc attention mechanisms to the network to verify the basis of its decisions. However, such methods suffer from problems such as gradient noise and false confidence. In addition, existing saliency methods either have limited performance because they use only the last convolutional layer or incur large computational overhead. In this work, we propose Collection-CAM, which generates an attention map with low computational overhead while utilizing multi-level feature maps. First, Collection-CAM searches for the most appropriate partition of the feature maps through bottom-up clustering and a cluster-validation process. It then applies different pre-processing procedures to the shallow and final feature maps to avoid the false positives that arise when both are treated identically. Finally, it combines the collection-wise masks according to their contribution to the confidence score to produce the attention map. Experimental results on the ImageNet1k, UC Merced, and CUB datasets with various deep network models demonstrate that Collection-CAM not only synthesizes saliency maps that provide better visual explanations but also requires significantly lower computational overhead than existing region-based saliency methods.
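
The abstract outlines three steps: partition feature-map channels into "collections" by bottom-up clustering, build one mask per collection, and weight each mask by its contribution to the confidence score. The snippet below is only a minimal sketch of that collection-wise masking idea, assuming PyTorch, scikit-learn, and a single feature layer; the function `collection_cam_sketch` and its arguments are illustrative assumptions, not the authors' implementation, which operates over pyramidal multi-level features and applies distinct pre-processing to shallow maps.

```python
# Sketch only: a Score-CAM-style procedure with channels grouped into collections.
import torch
import torch.nn.functional as F
from sklearn.cluster import AgglomerativeClustering  # bottom-up clustering


@torch.no_grad()
def collection_cam_sketch(model, image, target_class, feature_maps, n_collections=8):
    """image: (1, 3, H, W); feature_maps: (1, C, h, w) activations from one layer."""
    _, C, h, w = feature_maps.shape
    H, W = image.shape[-2:]

    # 1) Partition the C channels into collections via bottom-up (agglomerative) clustering.
    flat = feature_maps[0].reshape(C, -1).cpu().numpy()
    labels = AgglomerativeClustering(n_clusters=n_collections).fit_predict(flat)

    saliency = torch.zeros(1, 1, H, W, device=image.device)
    for k in range(n_collections):
        # 2) One spatial mask per collection: mean over its channels, upsampled, min-max normalized.
        member = torch.as_tensor(labels == k).to(feature_maps.device)
        m = feature_maps[0, member].mean(dim=0, keepdim=True)                     # (1, h, w)
        m = F.interpolate(m[None], size=(H, W), mode="bilinear", align_corners=False)
        m = (m - m.min()) / (m.max() - m.min() + 1e-8)                            # (1, 1, H, W)

        # 3) Weight the mask by the model's confidence on the masked input.
        score = torch.softmax(model(image * m.to(image.device)), dim=1)[0, target_class]
        saliency += score * m.to(image.device)

    # 4) ReLU and renormalize to obtain the final attention map.
    saliency = torch.relu(saliency)
    return (saliency - saliency.min()) / (saliency.max() - saliency.min() + 1e-8)
```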

Bibliographic Details
Main Authors: Yungi Ha (ORCID: 0000-0001-6437-6988), Chan-Hyun Youn (ORCID: 0000-0002-3970-7308)
Affiliation: School of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea
Format: Article
Language: English
Published: IEEE, 2022-01-01
Series: IEEE Access, vol. 10, pp. 112776-112788
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2022.3215534
Collection: DOAJ (Directory of Open Access Journals)
Subjects: Visual explanation, deep learning, acceleration, clustering analysis
Online Access: https://ieeexplore.ieee.org/document/9924199/