DOI: 10.1007/978-3-540-69162-4_32

Natural Conjugate Gradient in Variational Inference

Abstract: Variational methods for approximate inference in machine learning often adapt a parametric probability distribution to optimize a given objective function. This view is especially useful when applying variational Bayes (VB) to models outside the conjugate-exponential family. For them, variational Bayesian expectation maximization (VB EM) algorithms are not easily available, and gradient-based methods are often used as alternatives. Traditional natural gradient methods use the Riemannian structure (or geometry)…
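The abstract is truncated where the natural gradient is introduced. As a hedged reading aid (not part of the original abstract), the standard definition it leads into rescales the ordinary gradient of an objective F by the inverse Fisher information matrix of the parametric distribution q(·|ξ):

```latex
\tilde{\nabla}_{\xi} \mathcal{F} = G(\xi)^{-1} \nabla_{\xi} \mathcal{F},
\qquad
G_{ij}(\xi) = \mathbb{E}_{q(\cdot \mid \xi)}\!\left[
  \frac{\partial \ln q}{\partial \xi_i}\,
  \frac{\partial \ln q}{\partial \xi_j}
\right]
```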


Cited by 29 publications (26 citation statements)
References 9 publications
“…Variational inference for the model basically follows the scheme introduced in [3,10]: we derive a deterministic approximation of the free energy based on a fixed functional form of the posterior approximation and then apply gradient-based optimisation to minimise the free energy. The optimisation is made more efficient through the use of natural conjugate gradient (NCG) optimisation [11].…”
Section: Variational Inference (mentioning)
confidence: 99%
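To make the NCG reference above concrete, here is a minimal sketch of natural conjugate gradient minimisation of a free energy F(ξ). The callables `free_energy_grad` and `fisher_inv_mul` are hypothetical stand-ins for the model-specific gradient and for multiplication by the inverse Fisher information G(ξ)^{-1}; the fixed step size stands in for the line search a practical implementation would use.

```python
import numpy as np

def ncg_minimize(xi, free_energy_grad, fisher_inv_mul, step=1e-2, iters=100):
    """Minimise F(xi) by natural conjugate gradient with a Polak-Ribiere beta."""
    g = free_energy_grad(xi)       # ordinary gradient of the free energy
    g_nat = fisher_inv_mul(xi, g)  # natural gradient: G(xi)^{-1} g
    d = -g_nat                     # first direction: steepest natural descent
    for _ in range(iters):
        xi = xi + step * d         # fixed step; practical code uses a line search
        g_new = free_energy_grad(xi)
        g_nat_new = fisher_inv_mul(xi, g_new)
        # Polak-Ribiere coefficient, formed with the Riemannian inner
        # product <a, b> = a^T G^{-1} b that the natural gradient induces.
        beta = max(0.0, g_new @ (g_nat_new - g_nat) / (g @ g_nat))
        d = -g_nat_new + beta * d
        g, g_nat = g_new, g_nat_new
    return xi

# Illustrative run on a quadratic "free energy" F(xi) = 0.5 * xi^T A xi,
# pretending A is also the Fisher information (purely for demonstration).
A = np.diag([1.0, 10.0])
xi_opt = ncg_minimize(
    np.array([5.0, 5.0]),
    free_energy_grad=lambda xi: A @ xi,
    fisher_inv_mul=lambda xi, g: np.linalg.solve(A, g),
    step=0.5, iters=50,
)
print(xi_opt)  # approaches [0, 0]
```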
“…Our approach, proposed in [6], is to apply the natural gradient to update both the model parameters and the latent variables, using the geometry of the variational Bayesian approximation q(θ, Z|ξ). The matrix inversion required for the evaluation of the natural gradient in (14) would be prohibitively expensive if the full matrix had to be inverted.…”
Section: Natural Gradient Descent (mentioning)
confidence: 99%
“…The matrix inversion required for the evaluation of the natural gradient in (14) would be prohibitively expensive if the full matrix had to be inverted. Luckily, because of the typical factorizing approximation of q, the matrix G is block diagonal [6] without further approximations.…”
Section: Natural Gradient Descent (mentioning)
confidence: 99%
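To illustrate the block-diagonal point above: with a factorizing approximation q = ∏ᵢ qᵢ, the Fisher information G has one block per factor, so G^{-1}∇F is obtained by solving each small block separately rather than inverting the full matrix. A minimal sketch, with purely illustrative block sizes and values:

```python
import numpy as np

def natural_gradient_blockwise(grad, fisher_blocks):
    """Apply G^{-1} to `grad`, where G = blockdiag(fisher_blocks)."""
    out, start = np.empty_like(grad), 0
    for G_i in fisher_blocks:  # one Fisher block per factor q_i
        n = G_i.shape[0]
        out[start:start + n] = np.linalg.solve(G_i, grad[start:start + n])
        start += n
    return out

# Example: two factors with 2- and 3-dimensional parameter blocks.
blocks = [np.eye(2) * 4.0, np.eye(3) * 0.25]
print(natural_gradient_blockwise(np.ones(5), blocks))  # [0.25 0.25 4. 4. 4.]
```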