2016
DOI: 10.48550/arXiv.1608.04471
Preprint

Stein Variational Gradient Descent: A General Purpose Bayesian Inference Algorithm

Abstract: We propose a general purpose variational inference algorithm that forms a natural counterpart of gradient descent for optimization. Our method iteratively transports a set of particles to match the target distribution, by applying a form of functional gradient descent that minimizes the KL divergence. Empirical studies are performed on various real world models and datasets, on which our method is competitive with existing state-of-the-art methods. The derivation of our method is based on a new theoretical res…
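As a concrete illustration of the particle update described in the abstract, below is a minimal NumPy sketch of one SVGD iteration with an RBF kernel. The fixed bandwidth h, the step size, and the toy Gaussian target are illustrative assumptions made here, not the paper's settings (the paper uses a median-heuristic bandwidth and an adaptive step size); grad_log_p stands for the score function ∇x log p(x) of the target distribution.

```python
import numpy as np

def rbf_kernel(X, h):
    """RBF kernel matrix and summed kernel gradients for particles X of shape (n, d)."""
    diffs = X[:, None, :] - X[None, :, :]          # (n, n, d), entry [j, i] = x_j - x_i
    sq_dists = np.sum(diffs ** 2, axis=-1)         # (n, n) pairwise squared distances
    K = np.exp(-sq_dists / h)                      # k(x_j, x_i) = exp(-||x_j - x_i||^2 / h)
    # Sum over j of grad_{x_j} k(x_j, x_i): the "repulsive" term that spreads particles out.
    grad_K = np.sum(-(2.0 / h) * diffs * K[:, :, None], axis=0)   # (n, d)
    return K, grad_K

def svgd_step(X, grad_log_p, step_size=0.1, h=1.0):
    """One SVGD update for particles X (n, d); grad_log_p maps (n, d) -> (n, d)."""
    n = X.shape[0]
    K, grad_K = rbf_kernel(X, h)
    # phi approximates the functional gradient direction that decreases KL(q || p).
    phi = (K @ grad_log_p(X) + grad_K) / n
    return X + step_size * phi

# Toy usage: move randomly initialized particles toward a standard 2-D Gaussian target.
rng = np.random.default_rng(0)
X = rng.normal(loc=3.0, size=(100, 2))              # initial particles, offset from the target
for _ in range(500):
    X = svgd_step(X, lambda x: -x, step_size=0.05)  # grad log p(x) = -x for N(0, I)
print(X.mean(axis=0), X.std(axis=0))                # should end up close to 0 and 1
```

The first term in phi drives each particle toward high-density regions of the target via the weighted scores, while the kernel-gradient term acts as a repulsive force that keeps the particles from collapsing onto a single mode.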

Cited by 85 publications (54 citation statements) | References 14 publications
“…Asymptotic approximate methods are not sampling-based and propose a specific form of the posterior, like the Laplace method ([44, 22, 43]) and the Integrated Nested Laplace Approximation (INLA) ([39, 4, 45]) for latent Gaussian models. Optimization-based approximate methods like Variational Bayes (VB) ([3, 16, 7, 14]), Expectation Propagation (EP) ([32, 29, 10]), and discrete distribution approximations by [25] and [31] are also popular.…”
Section: Introduction 1: Bayesian Methods and Statistical Machine Learning
Mentioning (confidence: 99%)
“…Lastly, variational normalizing flow (VI-NF) (Rezende & Mohamed, 2015) is also included for comparison. We note that another popular line of sampling algorithms uses Stein Variational Gradient Descent (SVGD) and the particle-based variational inference approach (Liu & Wang, 2016; Liu et al., 2019). However, this type of method differs significantly from MCMC and relies on collections of interacting particles.…”
Section: Methods
Mentioning (confidence: 99%)
“…Figure 5: Generated samples from SVGD (Liu & Wang, 2016) with 100 steps. We generated samples with batch sizes 100, 1000, and 5000.…”
Section: 100 Particles, 1000 Particles, 5000 Particles
Mentioning (confidence: 99%)
“…Different from frequentist methods, Bayesian methods assume a prior over the model, and the uncertainty can be captured by the posterior. Bayesian inference has been popularized in machine learning, largely thanks to recent developments in scalable sampling methods (Welling and Teh, 2011; Chen et al., 2014; Seita et al., 2016; Wu et al., 2020), variational inference (Blei et al., 2017; Liu and Wang, 2016), and other approximation methods such as Gal and Ghahramani (2016) and Lee et al. (2018). In comparison, bootstrap has been much less widely used in modern machine learning and deep learning.…”
Section: Related Work
Mentioning (confidence: 99%)
“…For example, in autonomous driving applications, our device can only store a limited number of models and we need to make decisions within a short time, which makes the standard bootstrap with a large number of models no longer feasible. Typical ensemble methods in deep learning, such as Lakshminarayanan et al. (2016); Huang et al. (2017); Vyas et al. (2018); Maddox et al. (2019); Liu and Wang (2016), can only afford to use a small number (e.g., less than 20) of models.…”
Section: Introduction
Mentioning (confidence: 99%)