Image Deblurring Based on an Improved CNN-Transformer Combination Network

Bibliographic Details
Main Authors: Xiaolin Chen, Yuanyuan Wan, Donghe Wang, Yuqing Wang
Format: Article
Language: English
Published: MDPI AG 2022-12-01
Series: Applied Sciences
Online Access: https://www.mdpi.com/2076-3417/13/1/311
Description
Summary: Convolutional neural networks (CNNs) have recently become a common choice for restoring blurry images because of their strong ability to learn features from large-scale datasets. However, convolution is essentially a local operation with a limited receptive field, which reduces the naturalness of deblurring results. Moreover, CNN-based deblurring methods usually adopt many downsampling operations, which hinder detail recovery. Transformers, by contrast, focus on modeling global features, so they can cooperate with CNNs to enlarge the receptive field and compensate for the lost details. In this paper, we propose an improved CNN-transformer combination network for deblurring that adopts a coarse-to-fine architecture as its backbone. To extract local and global features simultaneously, common methods connect two blocks either in parallel or in cascade. In contrast, we design a local-global feature combination block (LGFCB) with a new connecting structure to make better use of the extracted features. The LGFCB comprises multi-scale residual blocks (MRB) and a transformer block. In addition, we adopt a channel attention fusion block (CAFB) in the encoder path to integrate features. To improve feature representation, we introduce a supervised attention block (SAB) in the decoder path, which operates on the restored images to refine features. Extensive experiments on the GoPro and RealBlur datasets indicate that our model achieves remarkable accuracy and processing speed.
ISSN: 2076-3417
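
The summary mentions a channel attention fusion block (CAFB) that integrates features in the encoder path, but this record does not describe its internals. The following is only a minimal sketch of the general idea of channel attention applied to the fusion of two feature maps. It assumes a squeeze-and-excitation style gate with the learned layers left out, and it should not be read as the authors' implementation. The names `global_average_pool` and `channel_attention_fuse` are hypothetical, and plain nested lists stand in for tensors.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def global_average_pool(features):
    """Reduce each channel (a 2-D list of floats) to its mean value."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]


def channel_attention_fuse(feat_a, feat_b):
    """Fuse two feature maps of shape [C][H][W] (hypothetical helper).

    Assumption: the maps are summed element-wise, then each channel is
    rescaled by a gate computed from its own global average. A trained
    block would pass the channel descriptors through small learned
    layers before the sigmoid; that step is omitted here.
    """
    # Element-wise sum of the two inputs.
    summed = [
        [[a + b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(feat_a, feat_b)
    ]
    # Squeeze: one scalar descriptor per channel.
    descriptors = global_average_pool(summed)
    # Excite: turn each descriptor into a gate in (0, 1).
    weights = [sigmoid(d) for d in descriptors]
    # Rescale every channel by its gate.
    fused = [
        [[w * value for value in row] for row in channel]
        for w, channel in zip(weights, summed)
    ]
    return fused, weights


# Two channels of size 2x2: the first carries signal, the second is empty.
feat_a = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
feat_b = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
fused, weights = channel_attention_fuse(feat_a, feat_b)
```

In this toy input the channel with a larger average response receives a larger gate than the empty channel, which is the behavior channel attention is meant to provide: informative channels are emphasized before the fused features are passed on.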