Multiscale Spatial–Spectral Interaction Transformer for Pan-Sharpening

Pan-sharpening methods based on deep neural networks (DNNs) have produced state-of-the-art fusion performance. However, DNN-based methods mainly focus on modeling the local properties of low spatial resolution multispectral (LR MS) and panchromatic (PAN) images with convolutional neural networks, while the global dependencies in the images are ignored. To capture the local and global properties of the images concurrently, we propose a multiscale spatial–spectral interaction transformer (MSIT) for pan-sharpening. Specifically, we construct multiscale sub-networks containing convolution–transformer encoders to extract local and global features at different scales from the LR MS and PAN images, respectively. Then, a spatial–spectral interaction attention module (SIAM) is designed to merge the features at each scale. In SIAM, interaction attention is used to decouple the spatial and spectral information efficiently, enhancing complementarity and reducing redundancy in the extracted features. The features from different scales are further integrated by a multiscale reconstruction module (MRM) to generate the desired high spatial resolution multispectral image, in which the spatial and spectral information is fused scale by scale. Experiments on reduced- and full-scale datasets demonstrate that the proposed MSIT produces better results, in terms of both visual and numerical analysis, than state-of-the-art methods.

Bibliographic Details
Main Authors: Feng Zhang, Kai Zhang, Jiande Sun
Affiliation: School of Information Science and Engineering, Shandong Normal University, Ji’nan 250358, China
Format: Article
Language: English
Published: MDPI AG, 2022-04-01
Series: Remote Sensing, Vol. 14, No. 7, Article 1736
ISSN: 2072-4292
DOI: 10.3390/rs14071736
Subjects: pan-sharpening; multispectral image; panchromatic image; multiscale transformer; spatial–spectral interaction attention
Online Access: https://www.mdpi.com/2072-4292/14/7/1736
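
The abstract describes the spatial–spectral interaction attention module (SIAM) only at a high level. The snippet below is a minimal PyTorch sketch of one way such an interaction block could look: PAN-derived spatial attention reweights the MS features, MS-derived spectral (channel) attention reweights the PAN features, and the two are fused. The class name InteractionAttention, the layer choices, and the channel sizes are illustrative assumptions drawn only from the abstract, not the authors' published implementation.

```python
# Hypothetical sketch of a spatial-spectral interaction attention block,
# loosely following the SIAM description in the abstract. Layer choices,
# names, and dimensions are assumptions, not the authors' implementation.
import torch
import torch.nn as nn


class InteractionAttention(nn.Module):
    """Cross-modulates MS and PAN features: PAN-derived spatial attention
    reweights MS features, MS-derived spectral (channel) attention
    reweights PAN features, and the results are fused by a 1x1 conv."""

    def __init__(self, channels: int):
        super().__init__()
        # Spatial attention map computed from PAN features (one weight per pixel).
        self.spatial_att = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )
        # Spectral (channel) attention vector computed from MS features.
        self.spectral_att = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, ms_feat: torch.Tensor, pan_feat: torch.Tensor) -> torch.Tensor:
        ms_enhanced = ms_feat * self.spatial_att(pan_feat)    # inject spatial detail into MS branch
        pan_enhanced = pan_feat * self.spectral_att(ms_feat)  # inject spectral weighting into PAN branch
        return self.fuse(torch.cat([ms_enhanced, pan_enhanced], dim=1))


# Toy usage: 8-channel features at one scale of the encoder.
if __name__ == "__main__":
    siam = InteractionAttention(channels=8)
    ms = torch.randn(1, 8, 64, 64)
    pan = torch.randn(1, 8, 64, 64)
    print(siam(ms, pan).shape)  # torch.Size([1, 8, 64, 64])
```

The cross-modulation mirrors the stated goal of enhancing complementarity and reducing redundancy: each branch is reweighted by attention derived from the other modality before the fused features are passed on for reconstruction.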