Exploiting diffusion prior for real-world image super-resolution
We present a novel approach to leverage prior knowledge encapsulated in pre-trained text-to-image diffusion models for blind super-resolution. Specifically, by employing our time-aware encoder, we can achieve promising restoration results without altering the pre-trained synthesis model, thereby preserving the generative prior and minimizing training cost.
Main Authors: | Wang, Jianyi; Yue, Zongsheng; Zhou, Shangchen; Chan, Kelvin C. K.; Loy, Chen Change |
---|---|
Other Authors: | College of Computing and Data Science; S-Lab |
Format: | Journal Article |
Language: | English |
Published: | 2024 |
Subjects: | Computer and Information Science; Image restoration; Diffusion models |
Online Access: | https://hdl.handle.net/10356/180685 |
author | Wang, Jianyi; Yue, Zongsheng; Zhou, Shangchen; Chan, Kelvin C. K.; Loy, Chen Change |
author2 | College of Computing and Data Science |
collection | NTU |
description | We present a novel approach to leverage prior knowledge encapsulated in pre-trained text-to-image diffusion models for blind super-resolution. Specifically, by employing our time-aware encoder, we can achieve promising restoration results without altering the pre-trained synthesis model, thereby preserving the generative prior and minimizing training cost. To remedy the loss of fidelity caused by the inherent stochasticity of diffusion models, we employ a controllable feature wrapping module that allows users to balance quality and fidelity by simply adjusting a scalar value during the inference process. Moreover, we develop a progressive aggregation sampling strategy to overcome the fixed-size constraints of pre-trained diffusion models, enabling adaptation to resolutions of any size. A comprehensive evaluation of our method using both synthetic and real-world benchmarks demonstrates its superiority over current state-of-the-art approaches. Code and models are available at https://github.com/IceClear/StableSR. |
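The abstract above describes a controllable feature wrapping module that blends features from the restoration encoder into the frozen diffusion decoder, with a single scalar adjusted at inference time to trade quality against fidelity. The snippet below is a minimal PyTorch sketch of that idea only; the module name `FeatureWrap`, the layer choices, and the residual blending rule are illustrative assumptions rather than the authors' implementation, which is available in the StableSR repository linked in the description.

```python
import torch
import torch.nn as nn

class FeatureWrap(nn.Module):
    """Hypothetical feature-wrapping block: predicts a residual correction to the
    decoder features from the concatenated encoder and decoder features."""
    def __init__(self, channels: int):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.SiLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )

    def forward(self, enc_feat: torch.Tensor, dec_feat: torch.Tensor, w: float) -> torch.Tensor:
        # w = 0 leaves the generative decoder features untouched (quality-oriented);
        # larger w injects more information from the LR-conditioned encoder (fidelity-oriented).
        residual = self.fuse(torch.cat([enc_feat, dec_feat], dim=1))
        return dec_feat + w * residual

# Dummy usage: blend 64-channel feature maps with a user-chosen scalar at inference time.
block = FeatureWrap(channels=64)
enc = torch.randn(1, 64, 32, 32)  # features from the degraded-image encoder (assumed shape)
dec = torch.randn(1, 64, 32, 32)  # features from the frozen diffusion decoder (assumed shape)
out = block(enc, dec, w=0.5)
print(out.shape)  # torch.Size([1, 64, 32, 32])
```

Because the blending happens only at the feature level, the same trained module can be evaluated with any value of w, which is what makes the trade-off adjustable without retraining.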
format | Journal Article |
id | ntu-10356/180685 |
institution | Nanyang Technological University |
language | English |
publishDate | 2024 |
record_format | dspace |
spelling | ntu-10356/180685; 2024-10-21T02:09:42Z
Title: Exploiting diffusion prior for real-world image super-resolution
Authors: Wang, Jianyi; Yue, Zongsheng; Zhou, Shangchen; Chan, Kelvin C. K.; Loy, Chen Change
Affiliations: College of Computing and Data Science; S-Lab
Subjects: Computer and Information Science; Image restoration; Diffusion models
Abstract: as given in the description field above.
Funder: National Research Foundation (NRF)
Funding: This study is supported by the National Research Foundation, Singapore under its AI Singapore Programme (AISG Award No: AISG2-PhD-2022-01-033[T]), RIE2020 Industry Alignment Fund Industry Collaboration Projects (IAF-ICP) Funding Initiative, as well as cash and in-kind contribution from the industry partner(s).
Acknowledgements: We sincerely thank Yi Li for providing valuable advice and building the WebUI implementation (https://github.com/pkuliyi2015/sd-webui-stablesr) of our work. We also thank the continuous interest and contributions from the community.
Record dates: 2024-10-21T02:09:42Z; published 2024
Type: Journal Article
Citation: Wang, J., Yue, Z., Zhou, S., Chan, K. C. K. & Loy, C. C. (2024). Exploiting diffusion prior for real-world image super-resolution. International Journal of Computer Vision. https://dx.doi.org/10.1007/s11263-024-02168-7
ISSN: 0920-5691
Handle: https://hdl.handle.net/10356/180685
DOI: 10.1007/s11263-024-02168-7
Scopus: 2-s2.0-85198058630
Language: en
Grants: RIE2020; AISG2-PhD-2022-01-033[T]
Journal: International Journal of Computer Vision
Rights: © 2024 The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature. All rights reserved. |
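The progressive aggregation sampling strategy mentioned in the abstract addresses the fixed input size of the pre-trained diffusion model by processing overlapping crops of a larger image and merging them. The following is a simplified, self-contained sketch of Gaussian-weighted tile aggregation in that spirit; the tile size, stride, and per-tile function are placeholders for illustration, not the paper's exact procedure, which operates inside the diffusion sampling loop.

```python
import torch

def gaussian_weight(tile: int, sigma: float = 0.3) -> torch.Tensor:
    """2D Gaussian weight map that peaks at the tile centre and decays toward the borders."""
    coords = torch.linspace(-1.0, 1.0, tile)
    g = torch.exp(-(coords ** 2) / (2 * sigma ** 2))
    return torch.outer(g, g)

def _positions(length: int, tile: int, stride: int):
    """Top-left offsets of overlapping tiles that fully cover a dimension of `length`."""
    pos = list(range(0, length - tile + 1, stride))
    if pos[-1] != length - tile:
        pos.append(length - tile)  # make sure the border is covered
    return pos

def tiled_apply(x: torch.Tensor, fn, tile: int = 64, stride: int = 48) -> torch.Tensor:
    """Apply `fn`, which only accepts tile x tile inputs, to overlapping crops of `x`
    and merge the outputs with Gaussian weights to avoid visible seams."""
    _, _, h, w = x.shape
    assert h >= tile and w >= tile, "input must be at least one tile in each dimension"
    out = torch.zeros_like(x)
    acc = torch.zeros(1, 1, h, w)
    weight = gaussian_weight(tile).view(1, 1, tile, tile)
    for y in _positions(h, tile, stride):
        for xo in _positions(w, tile, stride):
            crop = x[:, :, y:y + tile, xo:xo + tile]
            out[:, :, y:y + tile, xo:xo + tile] += fn(crop) * weight
            acc[:, :, y:y + tile, xo:xo + tile] += weight
    return out / acc

# Dummy usage: an identity "restorer" on a 1x4x128x160 latent that a fixed 64x64
# model could not process in a single pass.
latent = torch.randn(1, 4, 128, 160)
merged = tiled_apply(latent, fn=lambda t: t)
print(merged.shape)  # torch.Size([1, 4, 128, 160])
```

Gaussian weighting down-weights tile borders, so neighbouring crops dominate near their own centres; when per-tile outputs disagree slightly, the blend suppresses seam artifacts in the merged result.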
title | Exploiting diffusion prior for real-world image super-resolution |
topic | Computer and Information Science Image restoration Diffusion models |
url | https://hdl.handle.net/10356/180685 |