Mixed Attention-Based CrossX Network for Satellite Image Classification

Bibliographic Details
Main Authors: Xiaofan Zhang, Yuhui Zheng
Format: Article
Language: English
Published: IEEE 2023-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
Subjects: Attention mechanism; satellite image; scene classification
Online Access: https://ieeexplore.ieee.org/document/10227553/
collection DOAJ
description Remote sensing scene classification plays an important role in the analysis and interpretation of satellite images, but it remains challenging because of the large range of variation in the data, high spatial resolutions, and complex backgrounds. Most methods use convolutional neural networks (CNNs) for classification; however, common CNNs cannot accurately suppress background information while capturing key local characteristics of satellite images. In this article, we propose a scene classification algorithm for remote sensing images that improves the CrossX network with hybrid attention. A new hybrid attention module, consisting of a spatial attention (SA) module and a channel attention (CA) module, is introduced to fully extract salient features of the target. Specifically, the SA network aggregates features along two spatial directions to better capture the spatial relationships in the scene. In addition, a CA network using 1-D convolution is proposed to extract image features with a focus on capturing dependencies between channels. Distinctive characteristics of different semantic parts can be identified from the original features, compensating for the lack of semantic information in the spatial dimension, and more efficient feature representations can be obtained by fusing these features. The proposed method has been evaluated on the following remote sensing scene datasets: UC Merced, AID, and NWPU-RESISC45. With ResNet34 as the backbone network, it achieves classification accuracies of 99.25%, 96.52%, and 96.9% on the test sets. The experimental results show that our method outperforms current representative scene classifiers on both AID and NWPU-RESISC45, and its performance on UC Merced is comparable to that of state-of-the-art models. The proposed method focuses on improving the ability of the attention mechanism to extract features and obtain an efficient target feature representation, which can be used in computer vision tasks related to feature extraction and the classification of remote sensing scenes.
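The two attention ideas named in the description can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fixed kernel weights, and the use of plain sigmoid gates without learned layers are all assumptions made for illustration, and the abstract does not say how the SA and CA outputs are fused, so no fusion step is shown. Feature maps are nested lists of shape C x H x W.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_attention(x, kernel):
    """Gate each channel using a 1-D convolution over per-channel descriptors.

    x is a C x H x W nested list. kernel is an odd-length list of weights;
    in a trained network these would be learned (hypothetical values here).
    """
    # Global average pooling: one scalar descriptor per channel.
    desc = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    pad = len(kernel) // 2
    padded = [0.0] * pad + desc + [0.0] * pad
    # 1-D convolution across neighbouring channels, followed by a sigmoid gate.
    gates = [
        sigmoid(sum(w * padded[i + t] for t, w in enumerate(kernel)))
        for i in range(len(x))
    ]
    return [[[v * g for v in row] for row in ch] for ch, g in zip(x, gates)]


def spatial_attention(x):
    """Gate each position using features aggregated along two spatial directions.

    Assumption: the pooled row and column descriptors are passed straight to a
    sigmoid, with no learned transform in between.
    """
    out = []
    for ch in x:
        h, w = len(ch), len(ch[0])
        # Pool along the width: one descriptor per row.
        row_gate = [sigmoid(sum(row) / w) for row in ch]
        # Pool along the height: one descriptor per column.
        col_gate = [sigmoid(sum(ch[i][j] for i in range(h)) / h) for j in range(w)]
        out.append(
            [[ch[i][j] * row_gate[i] * col_gate[j] for j in range(w)] for i in range(h)]
        )
    return out


# Toy input: two channels of size 2 x 2.
x = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[1.0, 2.0], [3.0, 4.0]],
]
ca = channel_attention(x, kernel=[0.25, 0.5, 0.25])
sa = spatial_attention(x)
print(ca[1])
print(sa[1])
```

Because the 1-D kernel slides over the channel descriptors, each channel's gate depends on its neighbours, which is one inexpensive way to model dependencies between channels. The row and column gates play the analogous role for position along the two spatial axes.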
first_indexed 2024-03-12T01:24:13Z
id doaj.art-426d3bd007704c38b0f147655f60d335
institution Directory Open Access Journal
issn 2151-1535
last_indexed 2024-03-12T01:24:13Z
spelling Record ID: doaj.art-426d3bd007704c38b0f147655f60d335
Record timestamp: 2023-09-12T23:00:16Z
Volume: 16
Pages: 8022-8033
DOI: 10.1109/JSTARS.2023.3308032
IEEE document number: 10227553
Authors: Xiaofan Zhang (https://orcid.org/0009-0005-3348-5205); Yuhui Zheng (https://orcid.org/0000-0002-1709-3093)
Affiliation (listed for each author): School of Computers, Nanjing University of Information Science and Technology, Nanjing, China
topic Attention mechanism
satellite image
scene classification