Cross-and-Diagonal Networks: An Indirect Self-Attention Mechanism for Image Classification
Main Authors: Jiahang Lyu, Rongxin Zou, Qin Wan, Wang Xi, Qinglin Yang, Sarath Kodagoda, Shifeng Wang
Format: Article
Language: English
Published: MDPI AG, 2024-03-01
Series: Sensors
Subjects: image classification; computer vision; self-attention mechanism; CNN
Online Access: https://www.mdpi.com/1424-8220/24/7/2055
author | Jiahang Lyu; Rongxin Zou; Qin Wan; Wang Xi; Qinglin Yang; Sarath Kodagoda; Shifeng Wang |
collection | DOAJ |
description | In recent years, computer vision has witnessed remarkable advancements in image classification, specifically in the domains of fully convolutional neural networks (FCNs) and self-attention mechanisms. Nevertheless, both approaches exhibit certain limitations. FCNs tend to prioritize local information, potentially overlooking crucial global contexts, whereas self-attention mechanisms are computationally intensive despite their adaptability. In order to surmount these challenges, this paper proposes cross-and-diagonal networks (CDNet), an innovative network architecture that adeptly captures global information in images while preserving local details in a more computationally efficient manner. CDNet achieves this by establishing long-range relationships between pixels within an image, enabling the indirect acquisition of contextual information. This inventive indirect self-attention mechanism significantly enhances the network’s capacity. In CDNet, a new attention mechanism named “cross and diagonal attention” is proposed. This mechanism adopts an indirect approach by integrating two distinct components, cross attention and diagonal attention. By computing attention in different directions, specifically vertical and diagonal, CDNet effectively establishes remote dependencies among pixels, resulting in improved performance in image classification tasks. Experimental results highlight several advantages of CDNet. Firstly, it introduces an indirect self-attention mechanism that can be effortlessly integrated as a module into any convolutional neural network (CNN). Additionally, the computational cost of the self-attention mechanism has been effectively reduced, resulting in improved overall computational efficiency. Lastly, CDNet attains state-of-the-art performance on three benchmark datasets for similar types of image classification networks. In essence, CDNet addresses the constraints of conventional approaches and provides an efficient and effective solution for capturing global context in image classification tasks. |
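The abstract gives only a high-level description of the mechanism, so the following is a minimal PyTorch sketch of how a cross-and-diagonal attention block plugged into a CNN might look. The module name, the 1×1 query/key/value projections, the roll-based handling of diagonals, and the residual gate are all assumptions made for illustration; they are not the authors' implementation.

```python
# Minimal sketch of an indirect "cross and diagonal" attention block for a CNN
# feature map, written from the abstract alone. All names, shapes, and the
# roll-based diagonal trick are assumptions, not the paper's actual design.
import torch
import torch.nn as nn


def _axis_attention(q, k, v):
    """Self-attention restricted to the last spatial axis of (B, C, H, W) maps.

    Each row of length W attends only within itself, so the cost is
    O(H * W^2) rather than the O((H*W)^2) of full self-attention.
    """
    b, cq, h, w = q.shape
    cv = v.shape[1]
    q = q.permute(0, 2, 3, 1).reshape(b * h, w, cq)
    k = k.permute(0, 2, 3, 1).reshape(b * h, w, cq)
    v = v.permute(0, 2, 3, 1).reshape(b * h, w, cv)
    attn = torch.softmax(q @ k.transpose(1, 2) / cq ** 0.5, dim=-1)
    out = attn @ v                                      # (B*H, W, Cv)
    return out.reshape(b, h, w, cv).permute(0, 3, 1, 2)


class CrossDiagonalAttention(nn.Module):
    """Drop-in attention module for a CNN feature map of shape (B, C, H, W)."""

    def __init__(self, channels, reduction=8):
        super().__init__()
        inner = max(channels // reduction, 1)
        self.q = nn.Conv2d(channels, inner, 1)
        self.k = nn.Conv2d(channels, inner, 1)
        self.v = nn.Conv2d(channels, channels, 1)
        self.gamma = nn.Parameter(torch.zeros(1))       # residual gate, starts as identity

    def forward(self, x):
        q, k, v = self.q(x), self.k(x), self.v(x)
        h = x.shape[2]

        # Cross attention: attend along rows, then along columns via a transpose.
        row_ctx = _axis_attention(q, k, v)
        col_ctx = _axis_attention(q.transpose(2, 3), k.transpose(2, 3),
                                  v.transpose(2, 3)).transpose(2, 3)

        # Diagonal attention: roll row i by -i so (cyclic) diagonals line up as
        # rows, attend along rows, then roll back. This is one possible reading
        # of "diagonal attention"; the paper may define it differently.
        def shear(t, sign):
            rows = [torch.roll(t[:, :, i], sign * i, dims=-1) for i in range(h)]
            return torch.stack(rows, dim=2)

        dq, dk, dv = shear(q, -1), shear(k, -1), shear(v, -1)
        diag_ctx = shear(_axis_attention(dq, dk, dv), +1)

        return x + self.gamma * (row_ctx + col_ctx + diag_ctx)
```

As a usage example under the same assumptions, `CrossDiagonalAttention(64)` applied to a tensor of shape `(2, 64, 32, 32)` returns a tensor of the same shape, so the block can be inserted between existing convolutional stages without changing the surrounding architecture.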
format | Article |
id | doaj.art-4a9badc1866e4ff39abe0a22bb4ed4a0 |
institution | Directory Open Access Journal |
issn | 1424-8220 |
language | English |
publishDate | 2024-03-01 |
publisher | MDPI AG |
record_format | Article |
series | Sensors |
doi | 10.3390/s24072055 |
citation | Sensors, Vol. 24, Issue 7, Article No. 2055 (2024-03-01) |
affiliations | Jiahang Lyu, Rongxin Zou, Qin Wan, Wang Xi, Qinglin Yang, Shifeng Wang: School of Optoelectronic Engineering, Changchun University of Science and Technology, Changchun 130022, China; Sarath Kodagoda: Faculty of Engineering & Information Technology, University of Technology Sydney, Sydney, NSW 2007, Australia |
title | Cross-and-Diagonal Networks: An Indirect Self-Attention Mechanism for Image Classification |
topic | image classification; computer vision; self-attention mechanism; CNN |
url | https://www.mdpi.com/1424-8220/24/7/2055 |