IFormerFusion: Cross-Domain Frequency Information Learning for Infrared and Visible Image Fusion Based on the Inception Transformer

Current deep learning-based image fusion methods cannot sufficiently learn image features across a wide frequency range. Therefore, we propose IFormerFusion, which is based on the Inception Transformer and cross-domain frequency fusion. To learn features from high- and low-frequency inform...

Bibliographic Details
Main Authors: Zhang Xiong, Xiaohui Zhang, Qingping Hu, Hongwei Han
Format: Article
Language: English
Published: MDPI AG 2023-02-01
Series: Remote Sensing
Subjects: image fusion; transformer; inception transformer; infrared image; visible image
Online Access: https://www.mdpi.com/2072-4292/15/5/1352
collection DOAJ
description Current deep learning-based image fusion methods cannot sufficiently learn image features across a wide frequency range. Therefore, we propose IFormerFusion, which is based on the Inception Transformer and cross-domain frequency fusion. To learn features from high- and low-frequency information, we design the IFormer mixer, which splits the input features along the channel dimension and feeds them into parallel paths for high- and low-frequency mixers to achieve linear computational complexity. The high-frequency mixer adopts a convolution path and a max-pooling path, while the low-frequency mixer adopts a criss-cross attention path. Because high-frequency information relates to texture detail, we design a cross-domain frequency fusion strategy, which exchanges high-frequency information between the source images. This structure can sufficiently integrate complementary features and strengthen texture retention. Experiments on the TNO, OSU, and Road Scene datasets demonstrate that IFormerFusion outperforms other methods in objective and subjective evaluations.
first_indexed 2024-03-11T07:11:42Z
format Article
id doaj.art-9beca0e723794342aea06ba0781174c8
institution Directory Open Access Journal
issn 2072-4292
language English
last_indexed 2024-03-11T07:11:42Z
publishDate 2023-02-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj.art-9beca0e723794342aea06ba0781174c8
2023-11-17T08:31:54Z
eng
MDPI AG
Remote Sensing
2072-4292
2023-02-01
Volume 15, Issue 5, Article 1352
DOI: 10.3390/rs15051352
IFormerFusion: Cross-Domain Frequency Information Learning for Infrared and Visible Image Fusion Based on the Inception Transformer
Zhang Xiong, Xiaohui Zhang, Qingping Hu, Hongwei Han
Department of Weapon Engineering, Naval University of Engineering, Wuhan 430030, China (affiliation listed for all four authors)
https://www.mdpi.com/2072-4292/15/5/1352
image fusion; transformer; inception transformer; infrared image; visible image
title IFormerFusion: Cross-Domain Frequency Information Learning for Infrared and Visible Image Fusion Based on the Inception Transformer
topic image fusion
transformer
inception transformer
infrared image
visible image
url https://www.mdpi.com/2072-4292/15/5/1352
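
As a rough illustration of the channel split and the high-frequency exchange outlined in the description field, the following pure-Python sketch may help. It is not the authors' implementation: the function names, the `n_high` split parameter, and the toy list-of-lists feature layout are all assumptions made for this example. The convolution path is omitted, and the criss-cross attention path is replaced by a uniform average over each pixel's row and column, which is only a stand-in for learned attention weights.

```python
def split_channels(features, n_high):
    """Split a list of channels into a high- and a low-frequency group (n_high is assumed)."""
    return features[:n_high], features[n_high:]


def max_pool_3x3(channel):
    """3x3 max-pooling with stride 1; windows are clipped at the borders."""
    h, w = len(channel), len(channel[0])
    return [[max(channel[y][x]
                 for y in range(max(i - 1, 0), min(i + 2, h))
                 for x in range(max(j - 1, 0), min(j + 2, w)))
             for j in range(w)]
            for i in range(h)]


def criss_cross_mean(channel):
    """Uniform-weight stand-in for criss-cross attention: mean over a pixel's row and column."""
    h, w = len(channel), len(channel[0])
    row_sums = [sum(row) for row in channel]
    col_sums = [sum(channel[i][j] for i in range(h)) for j in range(w)]
    # The pixel itself appears in both its row and its column, so subtract it once.
    return [[(row_sums[i] + col_sums[j] - channel[i][j]) / (h + w - 1)
             for j in range(w)]
            for i in range(h)]


def mixer(features, n_high):
    """Send the two channel groups through their own parallel paths."""
    high, low = split_channels(features, n_high)
    return [max_pool_3x3(c) for c in high], [criss_cross_mean(c) for c in low]


def trade_high_frequency(infrared, visible, n_high):
    """Swap the high-frequency groups of the two sources; each keeps its own low-frequency group."""
    ir_high, ir_low = mixer(infrared, n_high)
    vis_high, vis_low = mixer(visible, n_high)
    return vis_high + ir_low, ir_high + vis_low


# Toy inputs: two channels of size 2x2 per source image.
infrared = [[[1, 2], [3, 4]], [[0, 0], [0, 4]]]
visible = [[[5, 5], [5, 5]], [[2, 2], [2, 2]]]
ir_out, vis_out = trade_high_frequency(infrared, visible, n_high=1)
```

After the exchange, the first channel of `ir_out` carries the pooled visible features and the first channel of `vis_out` carries the pooled infrared features, while the low-frequency channels stay with their original source.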