MFST: Multi-Modal Feature Self-Adaptive Transformer for Infrared and Visible Image Fusion

Infrared and visible image fusion aims to combine the thermal radiation information and detailed texture of the two source images into a single informative fused image. Deep learning methods have recently been widely applied to this task; however, these methods usually fuse multiple extracted features with...

Bibliographic Details
Main Authors:	Xiangzeng Liu, Haojie Gao, Qiguang Miao, Yue Xi, Yunfeng Ai, Dingguo Gao
Format:	Article
Language:	English
Published:	MDPI AG 2022-07-01
Series:	Remote Sensing
Subjects:	infrared image; visible image; transformer; image fusion; multi-modal feature; focal self-attention
Online Access:	https://www.mdpi.com/2072-4292/14/13/3233

_version_	1797442038638051328
author	Xiangzeng Liu, Haojie Gao, Qiguang Miao, Yue Xi, Yunfeng Ai, Dingguo Gao
author_facet	Xiangzeng Liu, Haojie Gao, Qiguang Miao, Yue Xi, Yunfeng Ai, Dingguo Gao
author_sort	Xiangzeng Liu
collection	DOAJ
description	Infrared and visible image fusion aims to combine the thermal radiation information and detailed texture of the two source images into a single informative fused image. Deep learning methods have recently been widely applied to this task; however, these methods usually fuse multiple extracted features with the same fusion strategy, which ignores differences in how these features are represented and results in information loss during fusion. To address this issue, we propose a novel method named multi-modal feature self-adaptive transformer (MFST) to preserve more significant information from the source images. First, multi-modal features are extracted from the input images by a convolutional neural network (CNN). Then, these features are fused by focal transformer blocks, which can be trained through an adaptive fusion strategy according to the characteristics of the different features. Finally, the fused features and the saliency information of the infrared image are used to obtain the fused image. The proposed fusion framework is evaluated on the TNO, LLVIP, and FLIR datasets with various scenes. Experimental results demonstrate that our method outperforms several state-of-the-art methods in both subjective and objective evaluation.
first_indexed	2024-03-09T12:35:42Z
format	Article
id	doaj.art-3631e24c53ac401fa5d13eb84c56a979
institution	Directory of Open Access Journals
issn	2072-4292
language	English
last_indexed	2024-03-09T12:35:42Z
publishDate	2022-07-01
publisher	MDPI AG
record_format	Article
series	Remote Sensing
spelling	doaj.art-3631e24c53ac401fa5d13eb84c56a979; 2023-11-30T22:24:03Z; eng; MDPI AG; Remote Sensing; ISSN 2072-4292; 2022-07-01; vol. 14, no. 13, article 3233; doi:10.3390/rs14133233; MFST: Multi-Modal Feature Self-Adaptive Transformer for Infrared and Visible Image Fusion; Authors: Xiangzeng Liu [0], Haojie Gao [1], Qiguang Miao [2], Yue Xi [3], Yunfeng Ai [4], Dingguo Gao [5]; Affiliations: [0] School of Computer Science and Technology, Xidian University, Xi’an 710071, China; [1] Guangzhou Institute of Technology, Xidian University, Xi’an 510555, China; [2] School of Computer Science and Technology, Xidian University, Xi’an 710071, China; [3] Guangzhou Institute of Technology, Xidian University, Xi’an 510555, China; [4] School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China; [5] School of Information of Science and Technology, Tibet University, Lhasa 850000, China; https://www.mdpi.com/2072-4292/14/13/3233; Keywords: infrared image; visible image; transformer; image fusion; multi-modal feature; focal self-attention
title	MFST: Multi-Modal Feature Self-Adaptive Transformer for Infrared and Visible Image Fusion
title_sort	mfst multi modal feature self adaptive transformer for infrared and visible image fusion
topic	infrared image; visible image; transformer; image fusion; multi-modal feature; focal self-attention
url	https://www.mdpi.com/2072-4292/14/13/3233

Similar Items