MFST: Multi-Modal Feature Self-Adaptive Transformer for Infrared and Visible Image Fusion

Bibliographic Details
Main Authors: Xiangzeng Liu, Haojie Gao, Qiguang Miao, Yue Xi, Yunfeng Ai, Dingguo Gao
Format: Article
Language: English
Published: MDPI AG 2022-07-01
Series: Remote Sensing
Subjects: infrared image; visible image; transformer; image fusion; multi-modal feature; focal self-attention
Online Access: https://www.mdpi.com/2072-4292/14/13/3233
author Xiangzeng Liu
Haojie Gao
Qiguang Miao
Yue Xi
Yunfeng Ai
Dingguo Gao
collection DOAJ
description Infrared and visible image fusion aims to combine the thermal radiation information of the infrared image and the detailed texture of the visible image into a single informative fused image. Recently, deep learning methods have been widely applied to this task; however, these methods usually fuse the multiple extracted features with the same fusion strategy, ignoring the differences in how these features represent the source content and thus losing information during fusion. To address this issue, we propose a novel method, the multi-modal feature self-adaptive transformer (MFST), to preserve more of the significant information in the source images. First, multi-modal features are extracted from the input images by a convolutional neural network (CNN). Then, these features are fused by focal transformer blocks that are trained with an adaptive fusion strategy matched to the characteristics of the different features. Finally, the fused features and the saliency information of the infrared image are combined to obtain the fused image. The proposed fusion framework is evaluated on the TNO, LLVIP, and FLIR datasets across various scenes. Experimental results demonstrate that our method outperforms several state-of-the-art methods in both subjective and objective evaluations.
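For orientation, the pipeline described above (CNN feature extraction, transformer-based fusion of the infrared and visible features, and saliency-guided reconstruction) can be sketched as below. This is a minimal illustrative sketch assuming PyTorch; standard multi-head self-attention stands in for the paper's focal transformer blocks, a normalized infrared intensity map stands in for its saliency term, and the module names (ConvEncoder, TransformerFusion, ConvDecoder) are hypothetical rather than the authors' implementation.

```python
# Hedged sketch of a CNN-encoder -> transformer-fusion -> decoder pipeline.
# NOT the MFST implementation: plain multi-head self-attention replaces the
# focal transformer blocks, and a min-max-normalized IR intensity map replaces
# the paper's infrared saliency information.
import torch
import torch.nn as nn


class ConvEncoder(nn.Module):
    """Shallow CNN that extracts features from one source image."""

    def __init__(self, in_ch: int = 1, dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, dim, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.net(x)


class TransformerFusion(nn.Module):
    """Fuses infrared and visible feature maps by attending over the joint
    token sequence (standard self-attention as a stand-in for focal attention)."""

    def __init__(self, dim: int = 64, heads: int = 4, layers: int = 2):
        super().__init__()
        block = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(block, num_layers=layers)
        self.proj = nn.Conv2d(2 * dim, dim, kernel_size=1)

    def forward(self, f_ir, f_vis):
        b, c, h, w = f_ir.shape
        t_ir = f_ir.flatten(2).transpose(1, 2)    # (B, H*W, C) tokens
        t_vis = f_vis.flatten(2).transpose(1, 2)  # (B, H*W, C) tokens
        t = self.blocks(torch.cat([t_ir, t_vis], dim=1))  # joint attention over both modalities
        t_ir, t_vis = t[:, : h * w], t[:, h * w:]
        fused = torch.cat([t_ir, t_vis], dim=2).transpose(1, 2).reshape(b, 2 * c, h, w)
        return self.proj(fused)                   # (B, C, H, W)


class ConvDecoder(nn.Module):
    """Reconstructs the fused image; a normalized IR intensity map serves as a
    crude saliency weight (an assumption, not the paper's saliency measure)."""

    def __init__(self, dim: int = 64, out_ch: int = 1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(dim, out_ch, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, fused, ir):
        lo = ir.amin(dim=(2, 3), keepdim=True)
        hi = ir.amax(dim=(2, 3), keepdim=True)
        saliency = (ir - lo) / (hi - lo + 1e-6)    # per-image min-max normalization
        return self.net(fused * (1.0 + saliency))  # emphasize salient IR regions


if __name__ == "__main__":
    ir, vis = torch.rand(1, 1, 64, 64), torch.rand(1, 1, 64, 64)
    enc_ir, enc_vis = ConvEncoder(), ConvEncoder()
    fuse, dec = TransformerFusion(), ConvDecoder()
    out = dec(fuse(enc_ir(ir), enc_vis(vis)), ir)
    print(out.shape)  # torch.Size([1, 1, 64, 64])
```

The sketch only illustrates the data flow implied by the abstract (per-modality encoding, attention-based fusion, saliency-weighted reconstruction); the adaptive per-level fusion strategy and training losses of MFST are not modeled here.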
first_indexed 2024-03-09T12:35:42Z
format Article
id doaj.art-3631e24c53ac401fa5d13eb84c56a979
institution Directory Open Access Journal
issn 2072-4292
language English
last_indexed 2024-03-09T12:35:42Z
publishDate 2022-07-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj.art-3631e24c53ac401fa5d13eb84c56a979 2023-11-30T22:24:03Z
doi 10.3390/rs14133233
citation Remote Sensing, MDPI AG, vol. 14, no. 13, art. 3233, 2022-07-01
affiliation Xiangzeng Liu: School of Computer Science and Technology, Xidian University, Xi’an 710071, China
affiliation Haojie Gao: Guangzhou Institute of Technology, Xidian University, Xi’an 510555, China
affiliation Qiguang Miao: School of Computer Science and Technology, Xidian University, Xi’an 710071, China
affiliation Yue Xi: Guangzhou Institute of Technology, Xidian University, Xi’an 510555, China
affiliation Yunfeng Ai: School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
affiliation Dingguo Gao: School of Information of Science and Technology, Tibet University, Lhasa 850000, China
title MFST: Multi-Modal Feature Self-Adaptive Transformer for Infrared and Visible Image Fusion
topic infrared image
visible image
transformer
image fusion
multi-modal feature
focal self-attention
url https://www.mdpi.com/2072-4292/14/13/3233