Leveraging attention‐based visual clue extraction for image classification

Abstract: Deep learning-based approaches have made considerable progress in image classification, but most of them lack interpretability, especially in revealing the decisive information behind the categorization of images. This paper seeks to answer the question of which clues encode the discriminative visual information between image categories and can help improve classification performance. To this end, an attention-based clue extraction network (ACENet) is introduced to mine the decisive local visual information for image classification. ACENet constructs a clue-attention mechanism, i.e. global-local attention, between an image and the visual clue proposals extracted from it, and then introduces a contrastive loss defined over the resulting discrete attention distribution to increase the discriminability of the clue proposals. The loss encourages attention to concentrate on discriminative clue proposals, i.e. those that are similar within a category and dissimilar across categories. Experimental results on the Negative Web Image (NWI) dataset and the public ImageNet2012 dataset demonstrate that ACENet extracts true clues that improve classification performance, outperforming both the baselines and state-of-the-art methods.
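The abstract outlines ACENet's two technical ingredients: global-local attention over clue proposals and a contrastive loss tied to the attention outcome. The sketch below is one plausible reading of that description in PyTorch, not the authors' implementation: the names GlobalLocalAttention and clue_contrastive_loss, the dot-product attention form, and all tensor shapes are hypothetical assumptions. Note also that the paper defines its loss over the discrete attention distribution itself; for concreteness the sketch uses a standard supervised-contrastive loss over the attention-pooled clue features instead.

    # Illustrative sketch only; names, shapes, and the loss form are assumptions.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class GlobalLocalAttention(nn.Module):
        """Scores local clue proposals against the global image feature."""
        def __init__(self, dim):
            super().__init__()
            self.query = nn.Linear(dim, dim)  # projects the global image feature
            self.key = nn.Linear(dim, dim)    # projects the clue-proposal features

        def forward(self, global_feat, clue_feats):
            # global_feat: (B, D); clue_feats: (B, N, D) for N clue proposals
            q = self.query(global_feat).unsqueeze(1)           # (B, 1, D)
            k = self.key(clue_feats)                           # (B, N, D)
            scores = (q * k).sum(-1) / k.size(-1) ** 0.5       # (B, N)
            attn = F.softmax(scores, dim=-1)                   # attention over clues
            pooled = (attn.unsqueeze(-1) * clue_feats).sum(1)  # (B, D) clue summary
            return pooled, attn

    def clue_contrastive_loss(pooled, labels, temperature=0.1):
        # Pulls attention-pooled clue features of the same class together and
        # pushes different classes apart (a supervised-contrastive form).
        z = F.normalize(pooled, dim=-1)
        sim = z @ z.t() / temperature                          # (B, B) similarities
        self_mask = torch.eye(z.size(0), dtype=torch.bool, device=z.device)
        sim = sim.masked_fill(self_mask, float('-inf'))        # exclude self pairs
        pos = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
        log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
        per_anchor = log_prob.masked_fill(~pos, 0.0).sum(1) / pos.sum(1).clamp(min=1)
        return -per_anchor.mean()

In a complete model, these pieces would sit on top of a backbone producing the global image feature and a proposal extractor producing the local clue features, with this loss added to the usual classification loss.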


Bibliographic Details
Main Authors: Yunbo Cui, Youtian Du, Xue Wang, Hang Wang, Chang Su
Affiliations: Yunbo Cui, Youtian Du, Xue Wang, Hang Wang (Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, No. 28 Xianning West Road, Xi'an, China); Chang Su (Department of Healthcare Policy and Research, Weill Cornell Medicine, Cornell University, Ithaca, New York, USA)
Format: Article
Language: English
Published: Wiley, 2021-10-01
Series: IET Image Processing, vol. 15, no. 12, pp. 2937-2947
ISSN: 1751-9659; 1751-9667
Collection: DOAJ (Directory of Open Access Journals)
Subjects: Image recognition; Computer vision and image processing techniques; Data mining; Neural nets
Online Access: https://doi.org/10.1049/ipr2.12280