MHST: Multiscale Head Selection Transformer for Hyperspectral and LiDAR Classification

The joint use of hyperspectral image (HSI) and light detection and ranging (LiDAR) data has significantly improved land-cover classification performance. Although spatial–spectral feature learning methods based on convolutional neural networks and transformer networks have achieved promine...

Bibliographic Details
Main Authors: Kang Ni, Duo Wang, Zhizhong Zheng, Peng Wang
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
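Subjects: Classification; feature learning; global class token; hyperspectral image (HSI); light detection and ranging (LiDAR) data; transformer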
Online Access: https://ieeexplore.ieee.org/document/10438852/
_version_ 1797267995516469248
author Kang Ni
Duo Wang
Zhizhong Zheng
Peng Wang
collection DOAJ
description The joint use of hyperspectral image (HSI) and light detection and ranging (LiDAR) data has significantly improved land-cover classification performance. Although spatial–spectral feature learning methods based on convolutional neural networks and transformer networks have achieved prominent advances, contextual information captured by fixed convolutional kernels, and by using all self-attention heads without selection, has limited ability to characterize the detailed information and nonredundant features of land covers in multimodal data. In this article, a multiscale head selection transformer (MHST) network is proposed to fully explore detailed and nonredundant features in the spatial and spectral dimensions of HSI and LiDAR data. To better acquire detailed spatial and spectral features at different scales, a multiscale spectral–spatial feature extraction module, comprising cascaded multiscale 3-D and 2-D convolutional layers, is inserted into MHST. In addition, an adaptive global feature extraction module based on a head selection pooling transformer is placed after the transformer encoder module to alleviate token redundancy adaptively. Finally, we develop a multimodal–multiscale feature fusion classification module that combines local features with the global class token for global–local fusion. Extensive experiments on three popular datasets demonstrate that MHST significantly outperforms other related networks.
first_indexed 2024-03-07T14:32:42Z
format Article
id doaj.art-24a03f053cff4ceebb1be98aa2d50c5b
institution Directory Open Access Journal
issn 2151-1535
language English
last_indexed 2024-04-25T01:25:27Z
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
spelling doaj.art-24a03f053cff4ceebb1be98aa2d50c5b
2024-03-09T00:00:08Z
eng
IEEE
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
2151-1535
2024-01-01
17
5470
5483
10.1109/JSTARS.2024.3366614
10438852
MHST: Multiscale Head Selection Transformer for Hyperspectral and LiDAR Classification
Kang Ni 0 https://orcid.org/0000-0003-1026-2074
Duo Wang 1 https://orcid.org/0009-0002-9451-4231
Zhizhong Zheng 2 https://orcid.org/0009-0002-3845-9602
Peng Wang 3 https://orcid.org/0000-0002-3825-6365
School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China
College of Automation and College of Artificial Intelligence, Nanjing University of Posts and Telecommunications, Nanjing, China
School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, China
Key Laboratory of Radar Imaging and Microwave Photonics, Ministry of Education, Nanjing University of Aeronautics and Astronautics, Nanjing, China
https://ieeexplore.ieee.org/document/10438852/
Classification
feature learning
global class token
hyperspectral image (HSI)
light detection and ranging (LiDAR) data
transformer
title MHST: Multiscale Head Selection Transformer for Hyperspectral and LiDAR Classification
topic Classification
feature learning
global class token
hyperspectral image (HSI)
light detection and ranging (LiDAR) data
transformer
url https://ieeexplore.ieee.org/document/10438852/