A Multi-Scale Deep Back-Projection Backbone for Face Super-Resolution with Diffusion Models
Face verification and recognition are important tasks that have made great progress in recent years. However, recognizing low-resolution faces from small images is still a difficult problem. In this paper, we advocate using diffusion models (DMs) to enhance the resolution of face images and improve their quality...
Main Authors: | Juhao Gao, Ni Tang, Dongxiao Zhang |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2023-07-01 |
Series: | Applied Sciences |
Subjects: | super-resolution; diffusion models; U-Net; back-projection |
Online Access: | https://www.mdpi.com/2076-3417/13/14/8110 |
_version_ | 1797590444748570624 |
---|---|
author | Juhao Gao; Ni Tang; Dongxiao Zhang |
author_facet | Juhao Gao; Ni Tang; Dongxiao Zhang |
author_sort | Juhao Gao |
collection | DOAJ |
description | Face verification and recognition are important tasks that have made great progress in recent years. However, recognizing low-resolution faces from small images is still a difficult problem. In this paper, we advocate using diffusion models (DMs) to enhance the resolution of face images and improve their quality for various downstream applications. Most existing DMs for super-resolution use U-Net as their backbone network, which only exploits multi-scale features along the spatial dimension. As a result, these DMs converge slowly and fail to capture complex details and fine textures. To address this issue, we propose a novel conditional generative model based on DMs called BPSR3, which replaces the U-Net in super-resolution via repeated refinement (SR3) with a multi-scale deep back-projection network structure. BPSR3 can extract richer features not only in depth but also in breadth, which helps to refine image quality effectively at different scales. Experimental results on facial datasets show that BPSR3 significantly improves both convergence speed and reconstruction performance. BPSR3 has about 1/4 of the parameters of SR3 but achieves a 50.1% improvement in PSNR, a 19.8% improvement in SSIM, and a 15.4% reduction in FID. Our contribution lies in achieving lower time and space consumption together with better reconstruction results. In addition, we propose the idea of enhancing the performance of DMs by replacing the U-Net with a better network. |
first_indexed | 2024-03-11T01:20:36Z |
format | Article |
id | doaj.art-5ea805fe02114c549754cc6d00fd4234 |
institution | Directory of Open Access Journals |
issn | 2076-3417 |
language | English |
last_indexed | 2024-03-11T01:20:36Z |
publishDate | 2023-07-01 |
publisher | MDPI AG |
record_format | Article |
series | Applied Sciences |
spelling | doaj.art-5ea805fe02114c549754cc6d00fd4234; 2023-11-18T18:08:22Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2023-07-01; 13; 14; 8110; 10.3390/app13148110; A Multi-Scale Deep Back-Projection Backbone for Face Super-Resolution with Diffusion Models; Juhao Gao, Ni Tang, Dongxiao Zhang (School of Science, Jimei University, Xiamen 361021, China); https://www.mdpi.com/2076-3417/13/14/8110; super-resolution; diffusion models; U-Net; back-projection |
spellingShingle | Juhao Gao Ni Tang Dongxiao Zhang A Multi-Scale Deep Back-Projection Backbone for Face Super-Resolution with Diffusion Models Applied Sciences super-resolution diffusion models U-Net back-projection |
title | A Multi-Scale Deep Back-Projection Backbone for Face Super-Resolution with Diffusion Models |
title_full | A Multi-Scale Deep Back-Projection Backbone for Face Super-Resolution with Diffusion Models |
title_fullStr | A Multi-Scale Deep Back-Projection Backbone for Face Super-Resolution with Diffusion Models |
title_full_unstemmed | A Multi-Scale Deep Back-Projection Backbone for Face Super-Resolution with Diffusion Models |
title_short | A Multi-Scale Deep Back-Projection Backbone for Face Super-Resolution with Diffusion Models |
title_sort | multi scale deep back projection backbone for face super resolution with diffusion models |
topic | super-resolution; diffusion models; U-Net; back-projection |
url | https://www.mdpi.com/2076-3417/13/14/8110 |
work_keys_str_mv | AT juhaogao amultiscaledeepbackprojectionbackboneforfacesuperresolutionwithdiffusionmodels AT nitang amultiscaledeepbackprojectionbackboneforfacesuperresolutionwithdiffusionmodels AT dongxiaozhang amultiscaledeepbackprojectionbackboneforfacesuperresolutionwithdiffusionmodels AT juhaogao multiscaledeepbackprojectionbackboneforfacesuperresolutionwithdiffusionmodels AT nitang multiscaledeepbackprojectionbackboneforfacesuperresolutionwithdiffusionmodels AT dongxiaozhang multiscaledeepbackprojectionbackboneforfacesuperresolutionwithdiffusionmodels |