IFormerFusion: Cross-Domain Frequency Information Learning for Infrared and Visible Image Fusion Based on the Inception Transformer
Current deep learning-based image fusion methods cannot sufficiently learn the features of images over a wide frequency range. Therefore, we propose IFormerFusion, which is based on the Inception Transformer and cross-domain frequency fusion. To learn features from high- and low-frequency information...
Main Authors: | Zhang Xiong, Xiaohui Zhang, Qingping Hu, Hongwei Han |
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-02-01 |
Series: | Remote Sensing |
Subjects: | image fusion; transformer; inception transformer; infrared image; visible image |
Online Access: | https://www.mdpi.com/2072-4292/15/5/1352 |
_version_ | 1827752225804386304 |
author | Zhang Xiong; Xiaohui Zhang; Qingping Hu; Hongwei Han |
author_facet | Zhang Xiong; Xiaohui Zhang; Qingping Hu; Hongwei Han |
author_sort | Zhang Xiong |
collection | DOAJ |
description | Current deep learning-based image fusion methods cannot sufficiently learn the features of images over a wide frequency range. Therefore, we propose IFormerFusion, which is based on the Inception Transformer and cross-domain frequency fusion. To learn features from high- and low-frequency information, we design the IFormer mixer, which splits the input features along the channel dimension and feeds them into parallel paths for high- and low-frequency mixers, achieving linear computational complexity. The high-frequency mixer adopts a convolution path and a max-pooling path, while the low-frequency mixer adopts a criss-cross attention path. Considering that high-frequency information relates to texture detail, we design a cross-domain frequency fusion strategy that trades high-frequency information between the source images. This structure can sufficiently integrate complementary features and strengthen the capability of texture retention. Experiments on the TNO, OSU, and Road Scene datasets demonstrate that IFormerFusion outperforms other methods in objective and subjective evaluations. |
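The description above outlines two technical ideas: a channel-split mixer with parallel high-frequency (convolution and max-pooling) and low-frequency (criss-cross attention) paths, and a cross-domain strategy that trades high-frequency features between the infrared and visible branches. The sketch below illustrates only the mixer idea and is based solely on the abstract; the class name, the channel-split ratio, and the use of plain multi-head self-attention in place of criss-cross attention are assumptions, not the authors' implementation.

```python
# Minimal sketch of an IFormer-style mixer, assuming a 50/50 channel split and
# standard multi-head self-attention as a stand-in for criss-cross attention.
import torch
import torch.nn as nn


class IFormerMixerSketch(nn.Module):
    """Channel-split mixer with parallel high- and low-frequency paths."""

    def __init__(self, channels: int = 64, high_ratio: float = 0.5, num_heads: int = 4):
        super().__init__()
        c_high = int(channels * high_ratio)    # channels routed to high-frequency paths
        self.c_conv = c_high // 2              # convolution branch
        self.c_pool = c_high - self.c_conv     # max-pooling branch
        self.c_low = channels - c_high         # low-frequency (attention) branch

        # High-frequency path 1: depthwise convolution to capture local detail.
        self.conv_branch = nn.Sequential(
            nn.Conv2d(self.c_conv, self.c_conv, kernel_size=3, padding=1, groups=self.c_conv),
            nn.GELU(),
        )
        # High-frequency path 2: max pooling followed by a pointwise convolution.
        self.pool_branch = nn.Sequential(
            nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
            nn.Conv2d(self.c_pool, self.c_pool, kernel_size=1),
        )
        # Low-frequency path: plain self-attention substitutes for criss-cross attention here.
        self.attn = nn.MultiheadAttention(self.c_low, num_heads, batch_first=True)
        self.fuse = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) feature map; split along the channel dimension.
        x_conv, x_pool, x_low = torch.split(x, [self.c_conv, self.c_pool, self.c_low], dim=1)

        high_conv = self.conv_branch(x_conv)
        high_pool = self.pool_branch(x_pool)

        b, c, h, w = x_low.shape
        tokens = x_low.flatten(2).transpose(1, 2)   # (B, H*W, c_low)
        low, _ = self.attn(tokens, tokens, tokens)
        low = low.transpose(1, 2).reshape(b, c, h, w)

        # Concatenate the branch outputs and fuse them back to `channels`.
        return self.fuse(torch.cat([high_conv, high_pool, low], dim=1))


if __name__ == "__main__":
    feats = torch.randn(1, 64, 32, 32)
    print(IFormerMixerSketch(channels=64)(feats).shape)  # torch.Size([1, 64, 32, 32])
```

The paper's cross-domain frequency fusion strategy would then exchange the high-frequency outputs between the infrared and visible branches before reconstruction; that exchange is not modeled in this sketch.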
first_indexed | 2024-03-11T07:11:42Z |
format | Article |
id | doaj.art-9beca0e723794342aea06ba0781174c8 |
institution | Directory Open Access Journal |
issn | 2072-4292 |
language | English |
last_indexed | 2024-03-11T07:11:42Z |
publishDate | 2023-02-01 |
publisher | MDPI AG |
record_format | Article |
series | Remote Sensing |
spelling | doaj.art-9beca0e723794342aea06ba0781174c8 | 2023-11-17T08:31:54Z | eng | MDPI AG | Remote Sensing | 2072-4292 | 2023-02-01 | vol. 15, no. 5, art. 1352 | 10.3390/rs15051352 | IFormerFusion: Cross-Domain Frequency Information Learning for Infrared and Visible Image Fusion Based on the Inception Transformer | Zhang Xiong; Xiaohui Zhang; Qingping Hu; Hongwei Han (all: Department of Weapon Engineering, Naval University of Engineering, Wuhan 430030, China) | https://www.mdpi.com/2072-4292/15/5/1352 | image fusion; transformer; inception transformer; infrared image; visible image |
spellingShingle | Zhang Xiong; Xiaohui Zhang; Qingping Hu; Hongwei Han | IFormerFusion: Cross-Domain Frequency Information Learning for Infrared and Visible Image Fusion Based on the Inception Transformer | Remote Sensing | image fusion; transformer; inception transformer; infrared image; visible image |
title | IFormerFusion: Cross-Domain Frequency Information Learning for Infrared and Visible Image Fusion Based on the Inception Transformer |
title_full | IFormerFusion: Cross-Domain Frequency Information Learning for Infrared and Visible Image Fusion Based on the Inception Transformer |
title_fullStr | IFormerFusion: Cross-Domain Frequency Information Learning for Infrared and Visible Image Fusion Based on the Inception Transformer |
title_full_unstemmed | IFormerFusion: Cross-Domain Frequency Information Learning for Infrared and Visible Image Fusion Based on the Inception Transformer |
title_short | IFormerFusion: Cross-Domain Frequency Information Learning for Infrared and Visible Image Fusion Based on the Inception Transformer |
title_sort | iformerfusion cross domain frequency information learning for infrared and visible image fusion based on the inception transformer |
topic | image fusion; transformer; inception transformer; infrared image; visible image |
url | https://www.mdpi.com/2072-4292/15/5/1352 |
work_keys_str_mv | AT zhangxiong iformerfusioncrossdomainfrequencyinformationlearningforinfraredandvisibleimagefusionbasedontheinceptiontransformer AT xiaohuizhang iformerfusioncrossdomainfrequencyinformationlearningforinfraredandvisibleimagefusionbasedontheinceptiontransformer AT qingpinghu iformerfusioncrossdomainfrequencyinformationlearningforinfraredandvisibleimagefusionbasedontheinceptiontransformer AT hongweihan iformerfusioncrossdomainfrequencyinformationlearningforinfraredandvisibleimagefusionbasedontheinceptiontransformer |