A Multi-Scale Deep Back-Projection Backbone for Face Super-Resolution with Diffusion Models

Face verification and recognition are important tasks that have made great progress in recent years. However, recognizing low-resolution faces from small images is still a difficult problem. In this paper, we advocate using diffusion models (DMs) to enhance face resolution and improve their quality...

Bibliographic Details
Main Authors:	Juhao Gao, Ni Tang, Dongxiao Zhang
Format:	Article
Language:	English
Published:	MDPI AG 2023-07-01
Series:	Applied Sciences
Subjects:	super-resolution; diffusion models; U-Net; back-projection
Online Access:	https://www.mdpi.com/2076-3417/13/14/8110

author	Juhao Gao; Ni Tang; Dongxiao Zhang
collection	DOAJ
description	Face verification and recognition are important tasks that have made great progress in recent years. However, recognizing low-resolution faces from small images remains difficult. In this paper, we advocate using diffusion models (DMs) to increase the resolution of face images and improve their quality for various downstream applications. Most existing DMs for super-resolution use U-Net as their backbone network, which only exploits multi-scale features along the spatial dimension. As a result, these DMs converge slowly and fail to capture complex details and fine textures. To address this issue, we propose BPSR3, a novel conditional generative model based on DMs that replaces the U-Net in super-resolution via repeated refinement (SR3) with a multi-scale deep back-projection network structure. BPSR3 can extract richer features not only in depth but also in breadth, which helps to refine the image quality effectively at different scales. Experimental results on facial datasets show that BPSR3 significantly improves both convergence speed and reconstruction performance. BPSR3 has about 1/4 of the parameters of SR3 but achieves a 50.1% improvement in PSNR, a 19.8% improvement in SSIM, and a 15.4% reduction in FID. Our contribution lies in reducing time and space consumption while achieving better reconstruction results. In addition, we propose the idea of enhancing the performance of DMs by replacing the U-Net with a better network.
first_indexed	2024-03-11T01:20:36Z
format	Article
id	doaj.art-5ea805fe02114c549754cc6d00fd4234
institution	Directory Open Access Journal
issn	2076-3417
language	English
last_indexed	2024-03-11T01:20:36Z
publishDate	2023-07-01
publisher	MDPI AG
record_format	Article
series	Applied Sciences
spelling	doaj.art-5ea805fe02114c549754cc6d00fd4234; 2023-11-18T18:08:22Z; eng; MDPI AG; Applied Sciences; 2076-3417; 2023-07-01; 13; 14; 8110; 10.3390/app13148110; A Multi-Scale Deep Back-Projection Backbone for Face Super-Resolution with Diffusion Models; Juhao Gao, Ni Tang, Dongxiao Zhang (School of Science, Jimei University, Xiamen 361021, China); https://www.mdpi.com/2076-3417/13/14/8110; super-resolution; diffusion models; U-Net; back-projection
title	A Multi-Scale Deep Back-Projection Backbone for Face Super-Resolution with Diffusion Models
topic	super-resolution; diffusion models; U-Net; back-projection
url	https://www.mdpi.com/2076-3417/13/14/8110


Similar Items