We formally map the problem of sampling from an unknown distribution with density $p_X$ in $\mathbb{R}^d$ to the problem of learning and sampling $p_{\mathbf{Y}}$ in $\mathbb{R}^{Md}$, obtained by convolving $p_X$ with a fixed factorial kernel: $p_{\mathbf{Y}}$ is referred to as the M-density and the factorial kernel as the multimeasurement noise model (MNM). The M-density is smoother than $p_X$ and easier to learn and sample from, yet for large $M$ the two problems are mathematically equivalent, since $X$ can be estimated exactly given $\mathbf{Y} = \mathbf{y}$ using the Bayes estimator $\widehat{x}(\mathbf{y}) = \mathbb{E}[X \mid \mathbf{Y} = \mathbf{y}]$. To formulate the problem, we derive $\widehat{x}(\mathbf{y})$ for Poisson and Gaussian MNMs, expressed in closed form in terms of the unnormalized $p_{\mathbf{Y}}$. This leads to a simple least-squares objective for learning parametric energy and score functions. We present various parametrization schemes of interest, including one in which studying Gaussian M-densities directly leads to multidenoising autoencoders; this is the first theoretical connection made in the literature between denoising autoencoders and empirical Bayes. Samples from $p_X$ are obtained by walk-jump sampling (Saremi & Hyvärinen, 2019): underdamped Langevin MCMC is used to sample from $p_{\mathbf{Y}}$ (walk), followed by the multimeasurement Bayes estimation of $X$ (jump). We study permutation-invariant Gaussian M-densities on the MNIST, CIFAR-10, and FFHQ-256 datasets, and demonstrate the effectiveness of this framework for realizing fast-mixing, stable Markov chains in high dimensions.
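For orientation, here is a sketch of the Gaussian case of this closed form, derived directly from the kernel $p(y_m|x) = \mathcal{N}(y_m;\, x,\, \sigma_m^2 I_d)$, where $\sigma_m$ denotes the noise scale of the $m$-th measurement. Differentiating $p(\mathbf{y}) = \int p(x) \prod_m p(y_m|x)\, dx$ with respect to $y_m$ gives

$$\nabla_{y_m} \log p(\mathbf{y}) = \frac{1}{p(\mathbf{y})} \int \frac{x - y_m}{\sigma_m^2}\, p(x) \prod_{m'=1}^{M} p(y_{m'}|x)\, dx = \frac{\mathbb{E}[X \mid \mathbf{Y} = \mathbf{y}] - y_m}{\sigma_m^2},$$

and therefore

$$\widehat{x}(\mathbf{y}) = y_m + \sigma_m^2\, \nabla_{y_m} \log p(\mathbf{y}) \quad \text{for any } m \in \{1, \dots, M\}.$$

Since the normalizing constant of $p_{\mathbf{Y}}$ contributes only an additive constant to $\log p(\mathbf{y})$, which vanishes under the gradient, the estimator indeed depends only on the unnormalized M-density.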
INTRODUCTION

Consider a collection of i.i.d. samples $\{x_i\}_{i=1}^n$, assumed to have been drawn from an unknown distribution with density $p_X$ in $\mathbb{R}^d$. An important problem in probabilistic modeling is the task of drawing independent samples from $p_X$, which has numerous potential applications. This problem is typically approached in two phases: approximating $p_X$, and drawing samples from the approximated density. In unnormalized models, the first phase is approached by learning the energy function $f_X$ associated with the Gibbs distribution $p_X \propto \exp(-f_X)$; for the second phase, one must resort to Markov chain Monte Carlo (MCMC) methods, such as Langevin MCMC, which are typically very slow to mix in high dimensions. MCMC sampling is considered an "art", and we do not have black-box samplers that converge fast and are stable for complex (natural) distributions. The source of the problem is mainly attributed to the fact that the energy functions of interest are typically highly nonconvex.

A broad sketch of our solution to this problem is to model a smoother density in an $M$-fold expanded space. The new density $p(\mathbf{y})$, called the M-density, is defined in $\mathbb{R}^{Md}$, where the bold $\mathbf{y}$ is shorthand for $(y_1, \dots, y_M)$. The M-density is smoother in the sense that its marginals $p_m(y_m)$ are obtained by the convolution $p_m(y_m) = \int p_m(y_m|x)\, p(x)\, dx$ with a smoothing kernel $p_m(y_m|x)$, which for most of the paper we take to be the isotropic Gaussian:

$$p_m(y_m|x) = \mathcal{N}(y_m;\, x,\, \sigma_m^2 I_d).$$

Although we bypass learning $p(x)$, the new formalism allows for generating samples from $p(x)$, since $X$ can be estimated exactly given $\mathbf{Y} = \mathbf{y}$ (for large $M$). To give a physical picture, the approac...
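To make the construction above concrete, the following is a minimal sketch in PyTorch of the Gaussian MNM and of the least-squares objective mentioned in the abstract. The network `xhat_net`, the flattened-input parametrization, and the noise scales are illustrative assumptions, not the paper's architecture or hyperparameters.

```python
# Minimal sketch of the Gaussian MNM: y_m = x + N(0, sigma_m^2 I_d), m = 1..M.
# Names and hyperparameters here are illustrative assumptions.
import torch

M, d = 4, 784                        # number of measurements, data dimension
sigmas = torch.full((M,), 1.0)       # per-measurement noise scales (assumed equal)

def make_measurements(x):
    """Draw y = (y_1, ..., y_M) from the factorial kernel; x: (batch, d)."""
    noise = torch.randn(x.shape[0], M, d) * sigmas.view(1, M, 1)
    return x.unsqueeze(1) + noise    # shape (batch, M, d)

def least_squares_loss(xhat_net, x):
    """Regress a parametric estimator onto clean x: the minimizer of the
    expected squared error is the Bayes estimator E[X | Y = y]."""
    y = make_measurements(x)                   # (batch, M, d)
    xhat = xhat_net(y.flatten(start_dim=1))    # (batch, d), estimate of E[X|Y=y]
    return ((xhat - x) ** 2).sum(dim=1).mean()
```

In walk-jump sampling, an estimator learned this way supplies the jump step: Langevin MCMC explores the smoother M-density $p(\mathbf{y})$ (walk), and $\widehat{x}(\mathbf{y})$ maps states of the chain back to the data space (jump).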