Mixed Attention-Based CrossX Network for Satellite Image Classification
The classification of remote sensing scenes is always a challenging task due to the large range of variation in the data, high spatial resolutions, and complex backgrounds. In the analysis and interpretation of satellite images, remote sensing scene classification plays an important role. Most metho...
Main Authors: | Xiaofan Zhang, Yuhui Zheng |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2023-01-01 |
Series: | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
Subjects: | Attention mechanism; satellite image; scene classification |
Online Access: | https://ieeexplore.ieee.org/document/10227553/ |
_version_ | 1797687939607560192 |
---|---|
author | Xiaofan Zhang Yuhui Zheng |
author_facet | Xiaofan Zhang Yuhui Zheng |
author_sort | Xiaofan Zhang |
collection | DOAJ |
description | Remote sensing scene classification plays an important role in the analysis and interpretation of satellite images, but it remains challenging because of the large variation in the data, high spatial resolutions, and complex backgrounds. Most methods use convolutional neural networks (CNNs) for classification; however, common CNNs cannot accurately suppress background information while capturing key local characteristics of satellite images. In this article, we propose a scene classification algorithm for remote sensing images that improves the CrossX network with hybrid attention. A new hybrid attention module, consisting of a spatial attention (SA) module and a channel attention (CA) module, is introduced to fully extract salient features of the target. Specifically, the SA network aggregates features along two spatial directions to better capture the spatial relationships in the scene. In addition, a CA network using 1-D convolution is proposed to extract image features with a focus on capturing dependencies between channels. Distinctive characteristics of different semantic parts can be identified from the original features, compensating for the lack of semantic information in the spatial dimension, and more efficient feature representations can be obtained by fusing these features. The proposed method was evaluated on the following remote sensing scene datasets: UC Merced, AID, and NWPU-RESISC45. With ResNet34 as the backbone network, it achieves 99.25%, 96.52%, and 96.9% classification accuracies on the test sets. The experimental results show that our method outperforms current representative scene classifiers on both AID and NWPU-RESISC45, and its performance on UC Merced is comparable to that of state-of-the-art models. The proposed method focuses on improving the ability of the attention mechanism to extract features and obtain an efficient target feature representation, which can be used in computer vision tasks related to feature extraction and remote sensing scene classification. |
first_indexed | 2024-03-12T01:24:13Z |
format | Article |
id | doaj.art-426d3bd007704c38b0f147655f60d335 |
institution | Directory Open Access Journal |
issn | 2151-1535 |
language | English |
last_indexed | 2024-03-12T01:24:13Z |
publishDate | 2023-01-01 |
publisher | IEEE |
record_format | Article |
series | IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing |
spelling | doaj.art-426d3bd007704c38b0f147655f60d335; 2023-09-12T23:00:16Z; eng; IEEE; IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing; 2151-1535; 2023-01-01; 16; 8022; 8033; 10.1109/JSTARS.2023.3308032; 10227553; Mixed Attention-Based CrossX Network for Satellite Image Classification; Xiaofan Zhang (https://orcid.org/0009-0005-3348-5205); Yuhui Zheng (https://orcid.org/0000-0002-1709-3093); School of Computers, Nanjing University of Information Science and Technology, Nanjing, China; School of Computers, Nanjing University of Information Science and Technology, Nanjing, China; https://ieeexplore.ieee.org/document/10227553/; Attention mechanism; satellite image; scene classification |
spellingShingle | Xiaofan Zhang Yuhui Zheng Mixed Attention-Based CrossX Network for Satellite Image Classification IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing Attention mechanism satellite image scene classification |
title | Mixed Attention-Based CrossX Network for Satellite Image Classification |
title_full | Mixed Attention-Based CrossX Network for Satellite Image Classification |
title_fullStr | Mixed Attention-Based CrossX Network for Satellite Image Classification |
title_full_unstemmed | Mixed Attention-Based CrossX Network for Satellite Image Classification |
title_short | Mixed Attention-Based CrossX Network for Satellite Image Classification |
title_sort | mixed attention based crossx network for satellite image classification |
topic | Attention mechanism; satellite image; scene classification |
url | https://ieeexplore.ieee.org/document/10227553/ |
work_keys_str_mv | AT xiaofanzhang mixedattentionbasedcrossxnetworkforsatelliteimageclassification AT yuhuizheng mixedattentionbasedcrossxnetworkforsatelliteimageclassification |
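
The description above mentions a CA module built on 1-D convolution and an SA module that aggregates features along two spatial directions. The record gives no implementation details, so the following is only a minimal sketch under stated assumptions: the channel gate follows the general pattern of efficient channel attention (global average pooling, then a 1-D convolution across channels), the spatial gate simply averages along rows and columns, and the two are fused by applying them in sequence. The function names, the fixed kernel weights, and the fusion order are hypothetical and are not taken from the article.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def channel_attention(x, kernel):
    """Gate each channel of x[c][h][w] using a 1-D convolution over channels.

    Assumption: kernel has odd length and fixed (not learned) weights.
    """
    num_channels = len(x)
    height, width = len(x[0]), len(x[0][0])
    # Global average pooling: one scalar descriptor per channel.
    pooled = [sum(sum(row) for row in ch) / (height * width) for ch in x]
    # Zero-pad so the 1-D convolution returns one value per channel.
    pad = len(kernel) // 2
    padded = [0.0] * pad + pooled + [0.0] * pad
    gates = [
        sigmoid(sum(kernel[j] * padded[c + j] for j in range(len(kernel))))
        for c in range(num_channels)
    ]
    return [[[gates[c] * v for v in row] for row in x[c]]
            for c in range(num_channels)]


def spatial_attention(x):
    """Gate each position with a row gate and a column gate.

    Assumption: the two directional descriptors are plain averages; a real
    module would usually transform them with learned layers first.
    """
    out = []
    for ch in x:
        height, width = len(ch), len(ch[0])
        # Aggregate along the width: one descriptor per row.
        row_gate = [sigmoid(sum(row) / width) for row in ch]
        # Aggregate along the height: one descriptor per column.
        col_gate = [sigmoid(sum(ch[i][j] for i in range(height)) / height)
                    for j in range(width)]
        out.append([[ch[i][j] * row_gate[i] * col_gate[j]
                     for j in range(width)] for i in range(height)])
    return out


def hybrid_attention(x, kernel=(0.25, 0.5, 0.25)):
    """Hypothetical fusion: channel attention followed by spatial attention."""
    return spatial_attention(channel_attention(x, list(kernel)))
```

Because every gate is a sigmoid output in (0, 1), this sketch only rescales the input feature map and preserves its shape; in a trained network the kernel and any directional transforms would be learned parameters rather than fixed values.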