Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer

Bibliographic Details
Main Authors: Quan Gu, Hongkang Luan, Kaixuan Huang, Yubao Sun
Format: Article
Language: English
Published: MDPI AG 2024-02-01
Series: Electronics
Subjects:
Online Access: https://www.mdpi.com/2079-9292/13/5/949
_version_ 1797264663027646464
author Quan Gu
Hongkang Luan
Kaixuan Huang
Yubao Sun
author_facet Quan Gu
Hongkang Luan
Kaixuan Huang
Yubao Sun
author_sort Quan Gu
collection DOAJ
description The distinctive feature of hyperspectral images (HSIs) is their large number of spectral bands, which makes it possible to identify categories of ground objects by capturing discrepancies in spectral information. Convolutional neural networks (CNNs) with attention modules effectively improve the classification accuracy of HSIs, but they struggle to capture long-range spectral–spatial dependencies. In recent years, the Vision Transformer (ViT) has received widespread attention for its excellent performance in acquiring long-range features. However, it must compute pairwise correlations between token embeddings, so its complexity grows quadratically with the number of tokens and the computational cost of the network increases accordingly. To address this issue, this paper proposes a multi-scale spectral–spatial attention network with a frequency-domain lightweight Transformer (MSA-LWFormer) for HSI classification. This method synergistically integrates CNNs, attention mechanisms, and the Transformer into a spectral–spatial feature extraction module and a frequency-domain fused classification module. Specifically, the spectral–spatial feature extraction module employs a multi-scale 2D-CNN with multi-scale spectral attention (MS-SA) to extract shallow spectral–spatial features and capture long-range spectral dependence. The frequency-domain fused classification module designs a frequency-domain lightweight Transformer that employs the Fast Fourier Transform (FFT) to convert features from the spatial domain to the frequency domain, effectively extracting global information while significantly reducing the time complexity of the network. Experiments on three classic hyperspectral datasets show that MSA-LWFormer achieves excellent performance.
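To make the FFT-based token-mixing idea concrete, the following is a minimal PyTorch-style sketch, not the authors' implementation: the class name FrequencyDomainMixer and all sizes are hypothetical, and the block only illustrates the general principle of a frequency-domain lightweight Transformer layer. It replaces the O(n^2) pairwise self-attention over n tokens with a parameter-free 2D FFT over the token and channel axes, which mixes information globally in roughly O(n log n).

    # Minimal sketch of FFT-based global token mixing (assumption: this mirrors only
    # the general idea of a frequency-domain lightweight Transformer block, not
    # MSA-LWFormer itself). All names and sizes are hypothetical.
    import torch
    import torch.nn as nn

    class FrequencyDomainMixer(nn.Module):
        """Replaces quadratic self-attention with a parameter-free 2D FFT mixing step."""

        def __init__(self, dim: int, mlp_ratio: int = 2):
            super().__init__()
            self.norm1 = nn.LayerNorm(dim)
            self.norm2 = nn.LayerNorm(dim)
            # Standard position-wise feed-forward network of a Transformer block.
            self.mlp = nn.Sequential(
                nn.Linear(dim, dim * mlp_ratio),
                nn.GELU(),
                nn.Linear(dim * mlp_ratio, dim),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, n_tokens, dim) spectral-spatial tokens.
            # 2D FFT over the token and feature axes; keeping the real part gives a
            # global, parameter-free mixing of all tokens in O(n log n).
            mixed = torch.fft.fft2(self.norm1(x)).real
            x = x + mixed                     # residual around the mixing step
            x = x + self.mlp(self.norm2(x))   # residual around the feed-forward step
            return x

    # Example: 64 spectral-spatial tokens of dimension 96 from a hyperspectral patch.
    tokens = torch.randn(8, 64, 96)
    print(FrequencyDomainMixer(dim=96)(tokens).shape)  # torch.Size([8, 64, 96])

The design choice being illustrated is that the FFT mixes every token with every other token without materializing an n-by-n attention matrix, which is where the claimed reduction in time complexity comes from.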
first_indexed 2024-04-25T00:32:28Z
format Article
id doaj.art-a94ab172d49843ce8b511f966874034e
institution Directory Open Access Journal
issn 2079-9292
language English
last_indexed 2024-04-25T00:32:28Z
publishDate 2024-02-01
publisher MDPI AG
record_format Article
series Electronics
spelling doaj.art-a94ab172d49843ce8b511f966874034e 2024-03-12T16:42:42Z; eng; MDPI AG; Electronics; 2079-9292; 2024-02-01; Vol. 13, Iss. 5, Art. 949; doi:10.3390/electronics13050949; Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer; Quan Gu, Hongkang Luan, Kaixuan Huang, Yubao Sun (all: Engineering Research Center of Digital Forensics, Ministry of Education, Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology, Nanjing University of Information Science and Technology, Nanjing 210044, China); abstract as in the description field above; https://www.mdpi.com/2079-9292/13/5/949; hyperspectral image classification; multi-scale spectral attention; Transformer; long-range spectral dependence
spellingShingle Quan Gu
Hongkang Luan
Kaixuan Huang
Yubao Sun
Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer
Electronics
hyperspectral image classification
multi-scale spectral attention
Transformer
long-range spectral dependence
title Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer
title_full Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer
title_fullStr Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer
title_full_unstemmed Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer
title_short Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer
title_sort hyperspectral image classification using multi scale lightweight transformer
topic hyperspectral image classification
multi-scale spectral attention
Transformer
long-range spectral dependence
url https://www.mdpi.com/2079-9292/13/5/949
work_keys_str_mv AT quangu hyperspectralimageclassificationusingmultiscalelightweighttransformer
AT hongkangluan hyperspectralimageclassificationusingmultiscalelightweighttransformer
AT kaixuanhuang hyperspectralimageclassificationusingmultiscalelightweighttransformer
AT yubaosun hyperspectralimageclassificationusingmultiscalelightweighttransformer