Single‐image super‐resolution using lightweight transformer‐convolutional neural network hybrid model
Abstract With constant advances in deep learning methods as applied to image processing, deep convolutional neural networks (CNNs) have been widely explored for single‐image super‐resolution (SISR) and have attained significant success. However, these CNN‐based methods cannot fully use the internal and external information of the image, so the authors add a lightweight Transformer structure to capture it.
Main Authors: | Yuanyuan Liu, Mengtao Yue, Han Yan, Lu Zhu |
---|---|
Format: | Article |
Language: | English |
Published: | Wiley, 2023-08-01 |
Series: | IET Image Processing |
Subjects: | image processing; image reconstruction; image resolution |
Online Access: | https://doi.org/10.1049/ipr2.12833 |
_version_ | 1797755423847088128 |
---|---|
author | Yuanyuan Liu; Mengtao Yue; Han Yan; Lu Zhu
author_facet | Yuanyuan Liu; Mengtao Yue; Han Yan; Lu Zhu
author_sort | Yuanyuan Liu |
collection | DOAJ |
description | Abstract With constant advances in deep learning methods as applied to image processing, deep convolutional neural networks (CNNs) have been widely explored for single‐image super‐resolution (SISR) and have attained significant success. However, these CNN‐based methods cannot fully use the internal and external information of the image, so the authors add a lightweight Transformer structure to capture it. Specifically, the authors apply a dense block structure and residual connections to build a residual dense convolution block (RDCB) that modestly reduces the parameter count and extracts shallow features. A lightweight transformer block (LTB) then further extracts features and learns texture details between patches through the self‐attention mechanism. The LTB comprises an efficient multi‐head transformer (EMT) with a small graphics processing unit (GPU) memory footprint, which benefits from feature preprocessing by multi‐head attention (MA), reduction, and expansion; as a result, the EMT significantly reduces GPU resource use. In addition, a detail‐purifying attention block (DAB) is proposed to exploit context information in the high‐resolution (HR) space and recover more detail. Extensive evaluations on four benchmark datasets demonstrate the effectiveness of the proposed model in terms of both quantitative metrics and visual quality. The proposed EMT uses only about 40% as much GPU memory as comparable methods, while achieving better performance. |
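The mechanism the abstract describes most concretely is the EMT's feature preprocessing: reduce the feature width before multi‐head self‐attention, then expand it back, which is where the GPU memory savings come from. Below is a minimal PyTorch sketch of that reduce‐attend‐expand idea; the class name `EfficientMHT`, the reduction ratio, and the tensor shapes are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the reduce-attend-expand pattern suggested by the
# abstract's EMT description. Module names, the reduction ratio, and shapes
# are assumptions for illustration, not the paper's code.
import torch
import torch.nn as nn

class EfficientMHT(nn.Module):
    def __init__(self, channels: int, heads: int = 4, reduction: int = 4):
        super().__init__()
        reduced = channels // reduction              # e.g. 64 -> 16
        self.reduce = nn.Linear(channels, reduced)   # feature "reduction"
        self.attn = nn.MultiheadAttention(reduced, heads, batch_first=True)
        self.expand = nn.Linear(reduced, channels)   # feature "expansion"
        self.norm = nn.LayerNorm(channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_patches, channels), i.e. flattened image patches
        r = self.reduce(x)                 # attention runs on narrow features
        a, _ = self.attn(r, r, r)          # multi-head self-attention
        return self.norm(x + self.expand(a))  # residual connection back to x

# Usage: 2 images, 64 patches each, 64-channel patch embeddings
tokens = torch.randn(2, 64, 64)
print(EfficientMHT(channels=64)(tokens).shape)  # torch.Size([2, 64, 64])
```

The design point is that self‐attention cost grows with the embedding width, so running `nn.MultiheadAttention` on the reduced features and restoring the width afterwards trades a little representational capacity for a large memory saving; the roughly 40% memory figure reported in the abstract would depend on ratios and layer counts not given here.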
first_indexed | 2024-03-12T17:47:39Z |
format | Article |
id | doaj.art-899e4cd6b7df45eb871bb78d38d40d27 |
institution | Directory Open Access Journal |
issn | 1751-9659; 1751-9667
language | English |
last_indexed | 2024-03-12T17:47:39Z |
publishDate | 2023-08-01 |
publisher | Wiley |
record_format | Article |
series | IET Image Processing |
spelling | doaj.art-899e4cd6b7df45eb871bb78d38d40d27; 2023-08-03T12:43:17Z; eng; Wiley; IET Image Processing; ISSN 1751-9659, 1751-9667; 2023-08-01, vol. 17, no. 10, pp. 2881-2893; doi:10.1049/ipr2.12833; Single‐image super‐resolution using lightweight transformer‐convolutional neural network hybrid model; Yuanyuan Liu, Mengtao Yue, Han Yan, Lu Zhu (all: School of Information Engineering, East China Jiaotong University, Nanchang, People's Republic of China); https://doi.org/10.1049/ipr2.12833; image processing; image reconstruction; image resolution
spellingShingle | Yuanyuan Liu; Mengtao Yue; Han Yan; Lu Zhu; Single‐image super‐resolution using lightweight transformer‐convolutional neural network hybrid model; IET Image Processing; image processing; image reconstruction; image resolution
title | Single‐image super‐resolution using lightweight transformer‐convolutional neural network hybrid model |
title_full | Single‐image super‐resolution using lightweight transformer‐convolutional neural network hybrid model |
title_fullStr | Single‐image super‐resolution using lightweight transformer‐convolutional neural network hybrid model |
title_full_unstemmed | Single‐image super‐resolution using lightweight transformer‐convolutional neural network hybrid model |
title_short | Single‐image super‐resolution using lightweight transformer‐convolutional neural network hybrid model |
title_sort | single image super resolution using lightweight transformer convolutional neural network hybrid model |
topic | image processing; image reconstruction; image resolution
url | https://doi.org/10.1049/ipr2.12833 |
work_keys_str_mv | AT yuanyuanliu singleimagesuperresolutionusinglightweighttransformerconvolutionalneuralnetworkhybridmodel AT mengtaoyue singleimagesuperresolutionusinglightweighttransformerconvolutionalneuralnetworkhybridmodel AT hanyan singleimagesuperresolutionusinglightweighttransformerconvolutionalneuralnetworkhybridmodel AT luzhu singleimagesuperresolutionusinglightweighttransformerconvolutionalneuralnetworkhybridmodel |