CAT: Learning to collaborate channel and spatial attention from multi‐information fusion
Abstract Channel and spatial attention mechanisms have been shown to provide a clear performance boost for deep convolutional neural networks. Most existing methods focus on only one of the two or run them in parallel (or in series), neglecting the collaboration between the two attentions. In order to better establish the featu...
Main Authors: | Zizhang Wu, Man Wang, Weiwei Sun, Yuchen Li, Tianhao Xu, Fan Wang, Keke Huang |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley 2023-04-01 |
Series: | IET Computer Vision |
Subjects: | channel attention, dynamic learning, entropy pooling, spatial attention |
Online Access: | https://doi.org/10.1049/cvi2.12166 |
_version_ | 1797846308425302016 |
---|---|
author | Zizhang Wu; Man Wang; Weiwei Sun; Yuchen Li; Tianhao Xu; Fan Wang; Keke Huang |
author_facet | Zizhang Wu; Man Wang; Weiwei Sun; Yuchen Li; Tianhao Xu; Fan Wang; Keke Huang |
author_sort | Zizhang Wu |
collection | DOAJ |
description | Abstract Channel and spatial attention mechanisms have been shown to provide a clear performance boost for deep convolutional neural networks. Most existing methods focus on only one of the two or run them in parallel (or in series), neglecting the collaboration between the two attentions. To better establish the feature interaction between the two types of attention, a plug‐and‐play attention module is proposed, termed ‘CAT’: activating the Collaboration between spatial and channel Attentions based on learned Traits. Specifically, traits are represented as trainable coefficients (i.e. colla‐factors) that adaptively combine the contributions of different attention modules to better fit different image hierarchies and tasks. Moreover, global entropy pooling is proposed in addition to the global average pooling and global maximum pooling (GMP) operators; it is an effective component for suppressing noise signals by measuring the information disorder of feature maps. A three‐way pooling operation is introduced into the attention modules, and the adaptive mechanism is applied to fuse their outcomes. Extensive experiments on MS COCO, Pascal‐VOC, Cifar‐100, and ImageNet show that CAT outperforms existing state‐of‐the‐art attention mechanisms in object detection, instance segmentation, and image classification. The model and code will be released soon. |
first_indexed | 2024-04-09T17:52:51Z |
format | Article |
id | doaj.art-b41cca5fbbbc4baca7e77c0141cf8d59 |
institution | Directory Open Access Journal |
issn | 1751-9632 1751-9640 |
language | English |
last_indexed | 2024-04-09T17:52:51Z |
publishDate | 2023-04-01 |
publisher | Wiley |
record_format | Article |
series | IET Computer Vision |
spelling | doaj.art-b41cca5fbbbc4baca7e77c0141cf8d59; 2023-04-15T11:16:51Z; eng; Wiley; IET Computer Vision; 1751-9632; 1751-9640; 2023-04-01; 17; 3; 309; 318; 10.1049/cvi2.12166; CAT: Learning to collaborate channel and spatial attention from multi‐information fusion; Zizhang Wu (Zongmu Technology, Shanghai, China); Man Wang (Zongmu Technology, Shanghai, China); Weiwei Sun (Zongmu Technology, Shanghai, China); Yuchen Li (Zongmu Technology, Shanghai, China); Tianhao Xu (Zongmu Technology, Shanghai, China); Fan Wang (Zongmu Technology, Shanghai, China); Keke Huang (Central South University, Changsha, China); Abstract Channel and spatial attention mechanisms have been shown to provide a clear performance boost for deep convolutional neural networks. Most existing methods focus on only one of the two or run them in parallel (or in series), neglecting the collaboration between the two attentions. To better establish the feature interaction between the two types of attention, a plug‐and‐play attention module is proposed, termed ‘CAT’: activating the Collaboration between spatial and channel Attentions based on learned Traits. Specifically, traits are represented as trainable coefficients (i.e. colla‐factors) that adaptively combine the contributions of different attention modules to better fit different image hierarchies and tasks. Moreover, global entropy pooling is proposed in addition to the global average pooling and global maximum pooling (GMP) operators; it is an effective component for suppressing noise signals by measuring the information disorder of feature maps. A three‐way pooling operation is introduced into the attention modules, and the adaptive mechanism is applied to fuse their outcomes. Extensive experiments on MS COCO, Pascal‐VOC, Cifar‐100, and ImageNet show that CAT outperforms existing state‐of‐the‐art attention mechanisms in object detection, instance segmentation, and image classification. The model and code will be released soon.; https://doi.org/10.1049/cvi2.12166; channel attention; dynamic learning; entropy pooling; spatial attention |
spellingShingle | Zizhang Wu; Man Wang; Weiwei Sun; Yuchen Li; Tianhao Xu; Fan Wang; Keke Huang; CAT: Learning to collaborate channel and spatial attention from multi‐information fusion; IET Computer Vision; channel attention; dynamic learning; entropy pooling; spatial attention |
title | CAT: Learning to collaborate channel and spatial attention from multi‐information fusion |
title_full | CAT: Learning to collaborate channel and spatial attention from multi‐information fusion |
title_fullStr | CAT: Learning to collaborate channel and spatial attention from multi‐information fusion |
title_full_unstemmed | CAT: Learning to collaborate channel and spatial attention from multi‐information fusion |
title_short | CAT: Learning to collaborate channel and spatial attention from multi‐information fusion |
title_sort | cat learning to collaborate channel and spatial attention from multi information fusion |
topic | channel attention; dynamic learning; entropy pooling; spatial attention |
url | https://doi.org/10.1049/cvi2.12166 |
work_keys_str_mv | AT zizhangwu catlearningtocollaboratechannelandspatialattentionfrommultiinformationfusion AT manwang catlearningtocollaboratechannelandspatialattentionfrommultiinformationfusion AT weiweisun catlearningtocollaboratechannelandspatialattentionfrommultiinformationfusion AT yuchenli catlearningtocollaboratechannelandspatialattentionfrommultiinformationfusion AT tianhaoxu catlearningtocollaboratechannelandspatialattentionfrommultiinformationfusion AT fanwang catlearningtocollaboratechannelandspatialattentionfrommultiinformationfusion AT kekehuang catlearningtocollaboratechannelandspatialattentionfrommultiinformationfusion |
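
The abstract in this record describes a three-way pooling step (global average, global maximum, and global entropy pooling) whose outcomes are fused by trainable coefficients. The following is a minimal illustrative sketch of that idea in plain Python, not the authors' implementation. Several things here are assumptions, because the record does not give the formulas: entropy is computed as Shannon entropy over a softmax of the spatial positions, the fusion weights are softmax-normalised, and all function names (`global_entropy_pool`, `three_way_pool`, and so on) are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def global_average_pool(channel):
    """Mean of all spatial positions of one channel (a list of rows)."""
    flat = [v for row in channel for v in row]
    return sum(flat) / len(flat)


def global_max_pool(channel):
    """Maximum over all spatial positions of one channel."""
    return max(v for row in channel for v in row)


def global_entropy_pool(channel):
    """Shannon entropy of the channel's spatial activations.

    ASSUMPTION: activations are turned into a probability distribution
    with a softmax over spatial positions; the paper may define the
    distribution differently.
    """
    flat = [v for row in channel for v in row]
    probs = softmax(flat)
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def three_way_pool(channel, raw_factors):
    """Fuse the three pooled values with learnable coefficients.

    ASSUMPTION: `raw_factors` stands in for the trainable coefficients
    (colla-factors); softmax normalisation is an illustrative choice.
    """
    weights = softmax(raw_factors)
    pooled = [
        global_average_pool(channel),
        global_max_pool(channel),
        global_entropy_pool(channel),
    ]
    return sum(w * p for w, p in zip(weights, pooled))


# A uniform map has maximal entropy; a single-peak map has low entropy.
flat_map = [[1.0, 1.0], [1.0, 1.0]]
peaky_map = [[10.0, 0.0], [0.0, 0.0]]
```

Under these assumptions, `global_entropy_pool(flat_map)` equals `math.log(4)` (the maximum for four positions), while `peaky_map` scores much lower, which is the sense in which an entropy descriptor measures the disorder of a feature map. In a real network the raw factors would be parameters updated by backpropagation rather than fixed numbers.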