Mixed Attention-Based CrossX Network for Satellite Image Classification

The classification of remote sensing scenes is always a challenging task due to the large range of variation in the data, high spatial resolutions, and complex backgrounds. In the analysis and interpretation of satellite images, remote sensing scene classification plays an important role. Most metho...

Bibliographic Details
Main Authors:	Xiaofan Zhang, Yuhui Zheng
Format:	Article
Language:	English
Published:	IEEE 2023-01-01
Series:	IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects:	Attention mechanism, satellite image, scene classification
Online Access:	https://ieeexplore.ieee.org/document/10227553/

_version_	1797687939607560192
author	Xiaofan Zhang, Yuhui Zheng
author_facet	Xiaofan Zhang, Yuhui Zheng
author_sort	Xiaofan Zhang
collection	DOAJ
description	The classification of remote sensing scenes is a challenging task because of the large range of variation in the data, high spatial resolutions, and complex backgrounds. Remote sensing scene classification plays an important role in the analysis and interpretation of satellite images. Most methods use convolutional neural networks (CNNs) for classification; however, common CNNs cannot accurately suppress background information while capturing key local characteristics of satellite images. In this article, we propose a scene classification algorithm for remote sensing images that improves the CrossX network with hybrid attention. A new hybrid attention module, consisting of a spatial attention (SA) module and a channel attention (CA) module, is introduced to fully extract salient features of the target. Specifically, the SA network aggregates features along two spatial directions to better capture the spatial relationships in the scene. In addition, a CA network using 1-D convolution is proposed to extract image features with a focus on capturing dependencies between channels. Distinctive characteristics of different semantic parts can be identified from the original features, compensating for the lack of semantic information in the spatial dimension, and more efficient feature representations can be obtained by fusing these features. The proposed method was evaluated on the following remote sensing scene datasets: UC Merced, AID, and NWPU-RESISC45. With ResNet34 as the backbone network, it achieves classification accuracies of 99.25%, 96.52%, and 96.9% on the respective test sets. The experimental results show that our method outperforms current representative scene classifiers on both AID and NWPU-RESISC45, and its performance on UC Merced is comparable to that of state-of-the-art models. The proposed method focuses on improving the ability of the attention mechanism to extract features and obtain an efficient target feature representation, which can be used in computer vision tasks related to feature extraction and the classification of remote sensing scenes.
first_indexed	2024-03-12T01:24:13Z
format	Article
id	doaj.art-426d3bd007704c38b0f147655f60d335
institution	Directory Open Access Journal
issn	2151-1535
language	English
last_indexed	2024-03-12T01:24:13Z
publishDate	2023-01-01
publisher	IEEE
record_format	Article
series	IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
spelling	doaj.art-426d3bd007704c38b0f147655f60d335 2023-09-12T23:00:16Z eng IEEE IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 2151-1535 2023-01-01 16 8022 8033 10.1109/JSTARS.2023.3308032 10227553 Mixed Attention-Based CrossX Network for Satellite Image Classification Xiaofan Zhang 0 https://orcid.org/0009-0005-3348-5205 Yuhui Zheng 1 https://orcid.org/0000-0002-1709-3093 School of Computers, Nanjing University of Information Science and Technology, Nanjing, China School of Computers, Nanjing University of Information Science and Technology, Nanjing, China https://ieeexplore.ieee.org/document/10227553/ Attention mechanism satellite image scene classification
spellingShingle	Xiaofan Zhang Yuhui Zheng Mixed Attention-Based CrossX Network for Satellite Image Classification IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing Attention mechanism satellite image scene classification
title	Mixed Attention-Based CrossX Network for Satellite Image Classification
title_full	Mixed Attention-Based CrossX Network for Satellite Image Classification
title_fullStr	Mixed Attention-Based CrossX Network for Satellite Image Classification
title_full_unstemmed	Mixed Attention-Based CrossX Network for Satellite Image Classification
title_short	Mixed Attention-Based CrossX Network for Satellite Image Classification
title_sort	mixed attention based crossx network for satellite image classification
topic	Attention mechanism, satellite image, scene classification
url	https://ieeexplore.ieee.org/document/10227553/
work_keys_str_mv	AT xiaofanzhang mixedattentionbasedcrossxnetworkforsatelliteimageclassification AT yuhuizheng mixedattentionbasedcrossxnetworkforsatelliteimageclassification
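
The description field in this record mentions a channel attention (CA) module built on 1-D convolution and a spatial attention (SA) module that aggregates features along two spatial directions. The following is a minimal pure-Python sketch of those two ideas, not the authors' implementation. The function names, the fixed kernel weights, the sigmoid gating, the use of average pooling, and the order in which the two modules are chained are all assumptions made for illustration; the published network learns its weights and may combine the modules differently.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat, kernel):
    """Rescale channels using a 1-D convolution over per-channel averages.

    feat:   list of C channels, each an H x W nested list of floats.
    kernel: odd-length list of 1-D weights (hypothetical, normally learned).
    """
    # Global average pooling: one scalar descriptor per channel.
    means = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feat]
    pad = len(kernel) // 2
    padded = [0.0] * pad + means + [0.0] * pad
    # 1-D convolution along the channel axis, then a sigmoid gate per channel.
    gates = [
        sigmoid(sum(w * padded[c + k] for k, w in enumerate(kernel)))
        for c in range(len(means))
    ]
    return [[[v * g for v in row] for row in ch] for ch, g in zip(feat, gates)]


def spatial_attention(feat):
    """Gate each position by averages taken along the two spatial directions."""
    out = []
    for ch in feat:
        h, w = len(ch), len(ch[0])
        # Average along the width: one gate per row.
        row_gate = [sigmoid(sum(row) / w) for row in ch]
        # Average along the height: one gate per column.
        col_gate = [sigmoid(sum(ch[i][j] for i in range(h)) / h) for j in range(w)]
        out.append(
            [[ch[i][j] * row_gate[i] * col_gate[j] for j in range(w)] for i in range(h)]
        )
    return out


def hybrid_attention(feat, kernel):
    # Assumed ordering: channel attention first, then spatial attention.
    return spatial_attention(channel_attention(feat, kernel))
```

Calling `hybrid_attention(feat, [0.25, 0.5, 0.25])` on a small nested-list feature map returns a map of the same shape in which every value has been scaled by gates between 0 and 1. In a real model these operations would run on tensors inside a deep learning framework, with the kernel weights trained end to end.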

Similar Items