Single‐image super‐resolution using lightweight transformer‐convolutional neural network hybrid model


Bibliographic Details
Main Authors: Yuanyuan Liu, Mengtao Yue, Han Yan, Lu Zhu
Format: Article
Language: English
Published: Wiley, 2023-08-01
Series: IET Image Processing
Subjects: image processing, image reconstruction, image resolution
Online Access: https://doi.org/10.1049/ipr2.12833
collection DOAJ
description Abstract With constant advances in deep learning methods as applied to image processing, deep convolutional neural networks (CNNs) have been widely explored in single‐image super‐resolution (SISR) problems and have attained significant success. These CNN‐based methods cannot fully use the internal and external information of the image. The authors add a lightweight Transformer structure to capture this information. Specifically, the authors apply a dense block structure and residual connection to build a residual dense convolution block (RDCB) that reduces the parameters somewhat and extracts shallow features. The lightweight transformer block (LTB) further extracts features and learns the texture details between the patches through the self‐attention mechanism. The LTB comprises an efficient multi‐head transformer (EMT) with small graphics processing unit (GPU) memory footprint, and benefits from feature preprocessing by multi‐head attention (MA), reduction, and expansion. The EMT significantly reduces the use of GPU resources. In addition, a detail‐purifying attention block (DAB) is proposed to explore the context information in the high‐resolution (HR) space to recover more details. Extensive evaluations of four benchmark datasets demonstrate the effectiveness of the authors’ proposed model in terms of quantitative metrics and visual effects. The proposed EMT only uses about 40% as much GPU memory as other methods, with better performance.
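The record contains no code, and the authors' implementation is not reproduced here. Purely as an illustration of the idea the abstract describes, the NumPy sketch below shows one way channel reduction before multi-head self-attention and expansion after it can shrink the attention workspace (the function name `reduced_mha`, all shapes, the weight initialisation, and the reduction ratio are assumptions, not the paper's EMT):

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax along the given axis
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def reduced_mha(x, heads=2, reduction=4, seed=0):
    """Multi-head self-attention on channel-reduced features (illustrative).

    x: (n_patches, channels). Channels are first projected down by a
    factor of `reduction`, attention runs in the small space, and the
    result is expanded back, so the Q/K/V tensors are `reduction` times
    narrower than they would be in the full channel space.
    """
    rng = np.random.default_rng(seed)
    n, c = x.shape
    cr = c // reduction                              # reduced channel width
    w_down = rng.standard_normal((c, cr)) / np.sqrt(c)
    w_qkv = rng.standard_normal((3, cr, cr)) / np.sqrt(cr)
    w_up = rng.standard_normal((cr, c)) / np.sqrt(cr)

    z = x @ w_down                                   # (n, cr): reduction step
    d = cr // heads                                  # per-head dimension
    # project to Q, K, V and split each into heads: (heads, n, d)
    q, k, v = (np.stack(np.split(z @ w, heads, axis=-1)) for w in w_qkv)
    att = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d))   # (heads, n, n)
    out = np.concatenate(att @ v, axis=-1)           # merge heads: (n, cr)
    return out @ w_up                                # (n, c): expansion step

x = np.random.default_rng(1).standard_normal((16, 32))  # 16 patches, 32 channels
y = reduced_mha(x)
print(y.shape)  # (16, 32)
```

With `reduction=4`, the attention matrices are computed over 8 channels instead of 32, which is the kind of saving the abstract attributes to the reduction/expansion preprocessing; the exact mechanism in the paper may differ.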
issn 1751-9659, 1751-9667
spelling Yuanyuan Liu, Mengtao Yue, Han Yan, Lu Zhu (all: School of Information Engineering, East China Jiaotong University, Nanchang, People's Republic of China). Single‐image super‐resolution using lightweight transformer‐convolutional neural network hybrid model. IET Image Processing 17(10):2881-2893, Wiley, 2023-08-01. https://doi.org/10.1049/ipr2.12833
topic image processing
image reconstruction
image resolution
url https://doi.org/10.1049/ipr2.12833