Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer
The distinctive feature of hyperspectral images (HSIs) is their large number of spectral bands, which allows us to identify categories of ground objects by capturing discrepancies in spectral information. Convolutional neural networks (CNNs) with attention modules effectively improve the classification accuracy of HSI. ...
Main Authors: | Quan Gu, Hongkang Luan, Kaixuan Huang, Yubao Sun |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2024-02-01 |
Series: | Electronics |
Subjects: | hyperspectral image classification; multi-scale spectral attention; Transformer; long-range spectral dependence |
Online Access: | https://www.mdpi.com/2079-9292/13/5/949 |
_version_ | 1797264663027646464 |
---|---|
author | Quan Gu; Hongkang Luan; Kaixuan Huang; Yubao Sun
author_facet | Quan Gu; Hongkang Luan; Kaixuan Huang; Yubao Sun
author_sort | Quan Gu |
collection | DOAJ |
description | The distinctive feature of hyperspectral images (HSIs) is their large number of spectral bands, which allows us to identify categories of ground objects by capturing discrepancies in spectral information. Convolutional neural networks (CNNs) with attention modules effectively improve the classification accuracy of HSI. However, CNNs are not successful in capturing long-range spectral–spatial dependence. In recent years, the Vision Transformer (ViT) has received widespread attention due to its excellent performance in acquiring long-range features. However, it requires calculating the pairwise correlation between token embeddings, so its complexity grows quadratically with the number of tokens, which increases the computational cost of the network. To address this issue, this paper proposes a multi-scale spectral–spatial attention network with a frequency-domain lightweight Transformer (MSA-LWFormer) for HSI classification. This method synergistically integrates CNNs, attention mechanisms, and Transformers into a spectral–spatial feature extraction module and a frequency-domain fused classification module. Specifically, the spectral–spatial feature extraction module employs a multi-scale 2D-CNN with multi-scale spectral attention (MS-SA) to extract shallow spectral–spatial features and capture long-range spectral dependence. In addition, the frequency-domain fused classification module designs a frequency-domain lightweight Transformer that employs the Fast Fourier Transform (FFT) to convert features from the spatial domain to the frequency domain, effectively extracting global information and significantly reducing the time complexity of the network. Experiments on three classic hyperspectral datasets show that MSA-LWFormer achieves excellent performance. |
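As a rough illustration of the idea described above, the block below replaces quadratic self-attention with an FFT-based token-mixing step. This is a minimal sketch, not the authors' MSA-LWFormer implementation: the class name FFTMixerBlock, the tensor shapes, and the FNet-style real-part mixing are assumptions made for illustration; the paper's frequency-domain lightweight Transformer is described here only at the level of the abstract.

```python
# Minimal sketch (assumed architecture, not the authors' code) of a Transformer-style
# block whose token mixing is done in the frequency domain: the O(N^2) pairwise
# self-attention is replaced by a Fast Fourier Transform, which mixes all tokens
# globally in O(N log N).
import torch
import torch.nn as nn


class FFTMixerBlock(nn.Module):
    """Transformer-style block with frequency-domain (FFT) token mixing."""

    def __init__(self, dim: int, mlp_ratio: int = 2):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.norm2 = nn.LayerNorm(dim)
        # Lightweight feed-forward network applied after frequency-domain mixing.
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim * mlp_ratio),
            nn.GELU(),
            nn.Linear(dim * mlp_ratio, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_tokens, dim) token embeddings from a CNN feature extractor.
        # 2D FFT over the token and channel axes; keeping only the real part gives a
        # parameter-free global mixing operation (FNet-style assumption).
        x = x + torch.fft.fft2(self.norm1(x), dim=(-2, -1)).real
        x = x + self.mlp(self.norm2(x))
        return x


if __name__ == "__main__":
    # Example: 64 spectral-spatial tokens of dimension 96 from a hypothetical patch.
    tokens = torch.randn(8, 64, 96)
    block = FFTMixerBlock(dim=96)
    print(block(tokens).shape)  # torch.Size([8, 64, 96])
```

Because the FFT mixes every token with every other token in O(N log N) operations and introduces no learned attention weights, a block like this captures global information while remaining considerably lighter than standard multi-head self-attention, which is the trade-off the abstract highlights.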
first_indexed | 2024-04-25T00:32:28Z |
format | Article |
id | doaj.art-a94ab172d49843ce8b511f966874034e |
institution | Directory Open Access Journal |
issn | 2079-9292 |
language | English |
last_indexed | 2024-04-25T00:32:28Z |
publishDate | 2024-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Electronics |
spelling | doaj.art-a94ab172d49843ce8b511f966874034e (2024-03-12T16:42:42Z); eng; MDPI AG; Electronics; ISSN 2079-9292; 2024-02-01; vol. 13, no. 5, art. 949; doi:10.3390/electronics13050949; Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer; Quan Gu, Hongkang Luan, Kaixuan Huang, Yubao Sun (all with the Engineering Research Center of Digital Forensics, Ministry of Education, Jiangsu Collaborative Innovation Center of Atmospheric Environment and Equipment Technology, Nanjing University of Information Science and Technology, Nanjing 210044, China); abstract as in the description field above; https://www.mdpi.com/2079-9292/13/5/949; keywords: hyperspectral image classification; multi-scale spectral attention; Transformer; long-range spectral dependence |
spellingShingle | Quan Gu Hongkang Luan Kaixuan Huang Yubao Sun Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer Electronics hyperspectral image classification multi-scale spectral attention Transformer long-range spectral dependence |
title | Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer |
title_full | Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer |
title_fullStr | Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer |
title_full_unstemmed | Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer |
title_short | Hyperspectral Image Classification Using Multi-Scale Lightweight Transformer |
title_sort | hyperspectral image classification using multi scale lightweight transformer |
topic | hyperspectral image classification multi-scale spectral attention Transformer long-range spectral dependence |
url | https://www.mdpi.com/2079-9292/13/5/949 |
work_keys_str_mv | AT quangu hyperspectralimageclassificationusingmultiscalelightweighttransformer AT hongkangluan hyperspectralimageclassificationusingmultiscalelightweighttransformer AT kaixuanhuang hyperspectralimageclassificationusingmultiscalelightweighttransformer AT yubaosun hyperspectralimageclassificationusingmultiscalelightweighttransformer |