Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer
The distinctive feature of hyperspectral images (HSIs) is their large number of spectral bands, which allows us to identify categories of ground objects by capturing discrepancies in spectral information. Convolutional neural networks (CNNs) with attention modules effectively improve the classification accuracy of HSI. ...
Main Authors: | Quan Gu, Hongkang Luan, Kaixuan Huang, Yubao Sun |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-02-01 |
Series: | Electronics |
Subjects: | hyperspectral image classification; multi-scale spectral attention; Transformer; long-range spectral dependence |
Online Access: | https://www.mdpi.com/2079-9292/13/5/949 |
_version_ | 1797264663027646464 |
---|---|
author | Quan Gu; Hongkang Luan; Kaixuan Huang; Yubao Sun
author_facet | Quan Gu; Hongkang Luan; Kaixuan Huang; Yubao Sun
author_sort | Quan Gu |
collection | DOAJ |
description | The distinctive feature of hyperspectral images (HSIs) is their large number of spectral bands, which allows us to identify categories of ground objects by capturing discrepancies in spectral information. Convolutional neural networks (CNNs) with attention modules effectively improve the classification accuracy of HSI. However, CNNs are not successful in capturing long-range spectral–spatial dependence. In recent years, the Vision Transformer (ViT) has received widespread attention due to its excellent performance in acquiring long-range features. However, it requires calculating the pairwise correlation between token embeddings, so its complexity grows quadratically with the number of tokens, which increases the computational cost of the network. To address this issue, this paper proposes a multi-scale spectral–spatial attention network with a frequency-domain lightweight Transformer (MSA-LWFormer) for HSI classification. This method synergistically integrates CNNs, attention mechanisms, and Transformers into a spectral–spatial feature extraction module and a frequency-domain fused classification module. Specifically, the spectral–spatial feature extraction module employs a multi-scale 2D-CNN with multi-scale spectral attention (MS-SA) to extract shallow spectral–spatial features and capture long-range spectral dependence. In addition, the frequency-domain fused classification module designs a frequency-domain lightweight Transformer that employs the Fast Fourier Transform (FFT) to convert features from the spatial domain to the frequency domain, effectively extracting global information and significantly reducing the time complexity of the network. Experiments on three classic hyperspectral datasets show that MSA-LWFormer achieves excellent performance. |
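As a rough illustration of the idea described above, the block below replaces quadratic self-attention with an FFT-based token-mixing step. This is a minimal sketch, not the authors' MSA-LWFormer implementation: the class name FFTMixerBlock, the tensor shapes, and the FNet-style real-part mixing are assumptions made for illustration; the paper's frequency-domain lightweight Transformer is described here only at the level of the abstract.

```python
# Minimal sketch (assumed architecture, not the authors' code) of a Transformer-style
# block whose token mixing is done in the frequency domain: the O(N^2) pairwise
# self-attention is replaced by a Fast Fourier Transform, which mixes all tokens
# globally in O(N log N).
import torch
import torch.nn as nn


class FFTMixerBlock(nn.Module):
    """Transformer-style block with frequency-domain (FFT) token mixing."""

    def __init__(self, dim: int, mlp_ratio: int = 2):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        # Lightweight feed-forward network applied after frequency-domain mixing.
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio),
            nn.GELU(),
            nn.Linear(dim * mlp_ratio, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_tokens, dim) token embeddings from a CNN feature extractor.
        # 2D FFT over the token and channel axes; keeping only the real part gives a
        # parameter-free global mixing operation (FNet-style assumption).
        x = x + torch.fft.fft2(self.norm1(x), dim=(-2, -1)).real
        x = x + self.mlp(self.norm2(x))
        return x


if __name__ == "__main__":
    # Example: 64 spectral-spatial tokens of dimension 96 from a hypothetical patch.
    tokens = torch.randn(8, 64, 96)
    block = FFTMixerBlock(dim=96)
    print(block(tokens).shape)  # torch.Size([8, 64, 96])
```

Because the FFT mixes every token with every other token in O(N log N) operations and introduces no learned attention weights, a block like this captures global information while remaining considerably lighter than standard multi-head self-attention, which is the trade-off the abstract highlights.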
first_indexed | 2024-04-25T00:32:28Z |
format | Article |
id | doaj.art-a94ab172d49843ce8b511f966874034e |
institution | Directory Open Access Journal |
issn | 2079-9292 |
language | English |
last_indexed | 2024-04-25T00:32:28Z |
publishDate | 2024-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Electronics |
spelling | doaj.art-a94ab172d49843ce8b511f966874034e (2024-03-12T16:42:42Z); eng; MDPI AG; Electronics; ISSN 2079-9292; 2024-02-01; vol. 13, no. 5, art. 949; doi:10.3390/electronics13050949; Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer; Quan Gu, Hongkang Luan, Kaixuan Huang, Yubao Sun (all with the Engineering Research Center of Digital Forensics, Ministry of Education, Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology, Nanjing University of Information Science and Technology, Nanjing 210044, China); abstract as in the description field above; https://www.mdpi.com/2079-9292/13/5/949; keywords: hyperspectral image classification; multi-scale spectral attention; Transformer; long-range spectral dependence |
spellingShingle | Quan Gu Hongkang Luan Kaixuan Huang Yubao Sun Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer Electronics hyperspectral image classification multi-scale spectral attention Transformer long-range spectral dependence |
title | Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer |
title_full | Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer |
title_fullStr | Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer |
title_full_unstemmed | Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer |
title_short | Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer |
title_sort | hyperspectral image classification using multi scale lightweight transformer |
topic | hyperspectral image classification multi-scale spectral attention Transformer long-range spectral dependence |
url | https://www.mdpi.com/2079-9292/13/5/949 |
work_keys_str_mv | AT quangu hyperspectralimageclassificationusingmultiscalelightweighttransformer AT hongkangluan hyperspectralimageclassificationusingmultiscalelightweighttransformer AT kaixuanhuang hyperspectralimageclassificationusingmultiscalelightweighttransformer AT yubaosun hyperspectralimageclassificationusingmultiscalelightweighttransformer |