Collection-CAM: A Faster Region-Based Saliency Method Using Collection-Wise Mask Over Pyramidal Features
Due to the black-box nature of deep networks, explaining their decision-making is extremely challenging. One solution is to apply post-hoc attention mechanisms to the deep network to verify the basis of its decisions. However, those methods suffer from problems such as gradient noise and false confidence....
Main Authors: | Yungi Ha, Chan-Hyun Youn |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2022-01-01 |
Series: | IEEE Access |
Subjects: | Visual explanation, deep learning, acceleration, clustering analysis |
Online Access: | https://ieeexplore.ieee.org/document/9924199/ |
_version_ | 1797989212666986496 |
---|---|
author | Yungi Ha; Chan-Hyun Youn |
author_facet | Yungi Ha; Chan-Hyun Youn |
author_sort | Yungi Ha |
collection | DOAJ |
description | Due to the black-box nature of deep networks, explaining their decision-making is extremely challenging. One solution is to apply post-hoc attention mechanisms to the deep network to verify the basis of its decisions. However, those methods suffer from problems such as gradient noise and false confidence. In addition, existing saliency methods either have limited performance because they use only the last convolution layer or suffer from large computational overhead. In this work, we propose Collection-CAM, which generates an attention map with low computational overhead while utilizing multi-level feature maps. First, Collection-CAM searches for the most appropriate partition through bottom-up clustering and a clustering validation process. Then Collection-CAM applies different pre-processing procedures to the shallow feature maps and the final feature map to avoid the false positives that arise when they are processed without distinction. By combining collection-wise masks according to their contribution to the confidence score, Collection-CAM completes the attention map generation process. Experimental results on the ImageNet1k, UC Merced, and CUB datasets with various deep network models demonstrate that Collection-CAM not only synthesizes saliency maps with better visual explanations but also requires significantly lower computational overhead than other region-based saliency methods. |
first_indexed | 2024-04-11T08:15:31Z |
format | Article |
id | doaj.art-0d33f9e43ada4dfb93df756afc335e48 |
institution | Directory Open Access Journal |
issn | 2169-3536 |
language | English |
last_indexed | 2024-04-11T08:15:31Z |
publishDate | 2022-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Access |
spelling | doaj.art-0d33f9e43ada4dfb93df756afc335e48; 2022-12-22T04:35:09Z; eng; IEEE; IEEE Access; 2169-3536; 2022-01-01; vol. 10, pp. 112776-112788; doi:10.1109/ACCESS.2022.3215534; 9924199; Collection-CAM: A Faster Region-Based Saliency Method Using Collection-Wise Mask Over Pyramidal Features; Yungi Ha (https://orcid.org/0000-0001-6437-6988), Chan-Hyun Youn (https://orcid.org/0000-0002-3970-7308); School of Electrical Engineering, Korea Advanced Institute of Science and Technology, Daejeon, South Korea; https://ieeexplore.ieee.org/document/9924199/; Visual explanation; deep learning; acceleration; clustering analysis |
spellingShingle | Yungi Ha; Chan-Hyun Youn; Collection-CAM: A Faster Region-Based Saliency Method Using Collection-Wise Mask Over Pyramidal Features; IEEE Access; Visual explanation; deep learning; acceleration; clustering analysis |
title | Collection-CAM: A Faster Region-Based Saliency Method Using Collection-Wise Mask Over Pyramidal Features |
title_sort | collection cam a faster region based saliency method using collection wise mask over pyramidal features |
topic | Visual explanation; deep learning; acceleration; clustering analysis |
url | https://ieeexplore.ieee.org/document/9924199/ |
work_keys_str_mv | AT yungiha collectioncamafasterregionbasedsaliencymethodusingcollectionwisemaskoverpyramidalfeatures AT chanhyunyoun collectioncamafasterregionbasedsaliencymethodusingcollectionwisemaskoverpyramidalfeatures |
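
The description above outlines a pipeline: group feature maps bottom-up, turn each group into a collection-wise mask, and combine the masks according to their contribution to the confidence score. The following is only a minimal NumPy sketch of that general idea, not the authors' implementation. The function names (`agglomerate`, `collection_masks`, `saliency`), the fixed `n_collections` (the paper selects the partition through clustering validation, which is omitted here), the zero-image baseline, and the toy `score_fn` are all assumptions made for illustration. Feature maps are assumed to be already resized to the image resolution.

```python
import numpy as np

def agglomerate(fmaps, n_collections):
    """Bottom-up grouping: repeatedly merge the two groups with the closest centroids."""
    flat = fmaps.reshape(len(fmaps), -1).astype(float)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    flat = flat / np.maximum(norms, 1e-12)  # scale-invariant comparison (assumption)
    groups = [[i] for i in range(len(fmaps))]
    while len(groups) > n_collections:
        cents = np.stack([flat[g].mean(axis=0) for g in groups])
        d = np.linalg.norm(cents[:, None] - cents[None, :], axis=-1)
        np.fill_diagonal(d, np.inf)
        i, j = (int(k) for k in np.unravel_index(np.argmin(d), d.shape))
        i, j = min(i, j), max(i, j)
        groups[i] = groups[i] + groups.pop(j)
    return groups

def collection_masks(fmaps, groups):
    """One mask per group: the mean feature map, min-max scaled to [0, 1]."""
    masks = []
    for g in groups:
        m = fmaps[g].mean(axis=0)
        span = m.max() - m.min()
        masks.append((m - m.min()) / span if span > 0 else np.zeros_like(m))
    return np.stack(masks)

def saliency(image, fmaps, score_fn, n_collections=2):
    """Weight each mask by the score gain of the masked image over a zero baseline."""
    groups = agglomerate(fmaps, n_collections)
    masks = collection_masks(fmaps, groups)
    base = score_fn(np.zeros_like(image))
    gains = np.array([score_fn(image * m) - base for m in masks])
    weights = np.maximum(gains, 0.0)  # keep only positive contributions
    if weights.sum() > 0:
        weights = weights / weights.sum()
    return np.tensordot(weights, masks, axes=1), weights

# Toy usage: the stand-in "model" scores the mean intensity of the top-left quadrant.
tl = np.zeros((8, 8)); tl[:4, :4] = 1.0
br = np.zeros((8, 8)); br[4:, 4:] = 1.0
fmaps = np.stack([tl, 2 * tl, 3 * tl, br, 2 * br, 3 * br])
image = np.ones((8, 8))
score_fn = lambda x: float(x[:4, :4].mean())
sal, weights = saliency(image, fmaps, score_fn, n_collections=2)
```

Scoring one masked image per collection, rather than one per feature-map channel, is where a region-based approach of this kind would save forward passes. In a real setting `score_fn` would be the network's confidence for the target class.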