2022
DOI: 10.1109/tpami.2022.3204461

Image Super-Resolution Via Iterative Refinement

Abstract: We present SR3, an approach to image Super-Resolution via Repeated Refinement. SR3 adapts denoising diffusion probabilistic models [1], [2] to image-to-image translation, and performs super-resolution through a stochastic iterative denoising process. Output images are initialized with pure Gaussian noise and iteratively refined using a U-Net architecture that is trained on denoising at various noise levels, conditioned on a low-resolution input image. SR3 exhibits strong performance on super-resolution tasks a…
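
The refinement loop described in the abstract can be sketched roughly as follows. This is an illustration only: it assumes standard DDPM ancestral sampling with a linear beta schedule (neither is specified in the abstract), and `toy_denoiser` is a hypothetical stand-in for the trained U-Net that simply treats the conditioning image as the clean target.

```python
import numpy as np

def toy_denoiser(x_t, y, t, abar):
    # Hypothetical stand-in for the trained U-Net (assumption, not SR3's model):
    # returns the noise that would explain x_t if the clean image were exactly y.
    return (x_t - np.sqrt(abar[t]) * y) / np.sqrt(1.0 - abar[t])

def iterative_refinement(y, num_steps=50, seed=0):
    # y: conditioning image, assumed already upsampled to the target resolution.
    rng = np.random.default_rng(seed)
    betas = np.linspace(1e-4, 0.02, num_steps)  # assumed linear noise schedule
    alphas = 1.0 - betas
    abar = np.cumprod(alphas)

    x = rng.standard_normal(y.shape)  # start from pure Gaussian noise
    for t in reversed(range(num_steps)):
        eps = toy_denoiser(x, y, t, abar)
        # Mean of the reverse step: strip out the predicted noise component.
        mean = (x - betas[t] / np.sqrt(1.0 - abar[t]) * eps) / np.sqrt(alphas[t])
        # Re-inject noise on every step except the last one.
        noise = rng.standard_normal(y.shape) if t > 0 else 0.0
        x = mean + np.sqrt(betas[t]) * noise
    return x

y = np.full((3, 8, 8), 0.5)
sr = iterative_refinement(y)
print(sr.shape)
```

Because the toy denoiser always points back at `y`, the loop ends at `y` itself; with a trained network the same loop would instead produce a sample of plausible high-resolution detail.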

Citations: cited by 712 publications (332 citation statements)
References: 43 publications
“…This method can be generalized for learning conditional distributions: by augmenting the denoising network with an auxiliary input y, the network $f_\theta(x_t, t, y)$ and its resulting diffusion process can faithfully sample from a data distribution conditioned on y. The conditioning input y can be a low-resolution version of the desired image [47] or a class label [19]. Furthermore, y can also be on a text sequence describing the desired image [40,43,46].…”
Section: Preliminaries
Citation type: mentioning
confidence: 99%
“…Some modern video SR models also leverage the generative networks to compensate spatial-temporal coherence across frames, e.g., TecoGAN [8], PULSE [36] and Real-ESRGAN [52]. Recently, the diffusion probabilistic models (e.g., DDPM [17], DDIM [49] and LDM [42]) have achieved impressive performance in diverse generative tasks, including inpainting [34], colorization [50] and image synthesis [43]. Inspired by the latest research progress of neural-enhancing models, we intend to conduct the encoder-decoder (i.e., codec) synergy by leveraging the visual-synthesis genius of diffusion models.…”
Section: Neural-enhancing Models
Citation type: mentioning
confidence: 99%
“…As low-quality videos are received by the ingest server, the decoder can capture all the compressed frames and upscale them to original resolution by using the fast bilinear interpolation. Following the mainstream mechanism to handle the conditioning [43], the decoder initializes a Gaussian noise as the generative seed and concatenates the upscaled frames with it along the channel dimension. The diffusion model takes the concatenation result to generate high-quality frames.…”
Section: Distortion-aware Conditioning
Citation type: mentioning
confidence: 99%
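
The channel-wise conditioning described in this statement can be sketched as below. This is a minimal illustration, not the citing paper's code: the function name `build_denoiser_input`, the channels-first `(C, H, W)` layout, and the order of concatenation (frame first, noise second) are all assumptions.

```python
import numpy as np

def build_denoiser_input(upscaled_frame, rng):
    # upscaled_frame: (C, H, W) array, e.g. a bilinearly upscaled decoded frame.
    # Draw a Gaussian noise "seed" with the same shape as the frame.
    noise = rng.standard_normal(upscaled_frame.shape)
    # Stack conditioning frame and noise along the channel axis (axis 0 here).
    return np.concatenate([upscaled_frame, noise], axis=0)

rng = np.random.default_rng(0)
frame = np.zeros((3, 64, 64))
x = build_denoiser_input(frame, rng)
print(x.shape)  # (6, 64, 64)
```

The denoising network then needs twice as many input channels as the image has, since it sees the conditioning frame and the current noisy estimate side by side.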
“…Diffusion models (DMs) [11,47,48,53] are deep generative models that have been gaining attention in recent years. DMs have achieved state-of-the-art performance in several tasks involving conditional image generation [4,39,49], image super resolution [40], image colorization [38], and other related tasks [6,16,33,41]. In addition, recently proposed latent diffusion models (LDMs) [37] have further reduced computational costs by utilizing the latent space generated by their autoencoding component, enabling more efficient computations in the training and inference phases.…”
Section: Introduction
Citation type: mentioning
confidence: 99%