2018
DOI: 10.1007/s41884-018-0015-3
Natural gradient via optimal transport

Abstract: We study a natural Wasserstein gradient flow on manifolds of probability distributions with discrete sample spaces. We derive the Riemannian structure for the probability simplex from the dynamical formulation of the Wasserstein distance on a weighted graph. We pull back the geometric structure to the parameter space of any given probability model, which allows us to define a natural gradient flow there. In contrast to the natural Fisher-Rao gradient, the natural Wasserstein gradient incorporates a ground metr…
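The pullback construction in the abstract can be sketched numerically. The following is a minimal illustration only, not the authors' implementation: the softmax model, the unit graph weights, the arithmetic-mean edge conductance, the KL objective, and the step size are all assumptions made for the sketch.

```python
import numpy as np

def softmax(theta):
    """An assumed example probability model p_theta on n states."""
    z = np.exp(theta - theta.max())
    return z / z.sum()

def softmax_jacobian(theta):
    """J[i, j] = dp_i / dtheta_j for the softmax model."""
    p = softmax(theta)
    return np.diag(p) - np.outer(p, p)

def graph_laplacian(p, W):
    """Weighted graph Laplacian L(p) whose pseudoinverse acts as the
    Wasserstein metric tensor on the simplex interior; edge conductance
    w_ij * (p_i + p_j) / 2 is one common averaging choice."""
    C = W * 0.5 * (p[:, None] + p[None, :])
    return np.diag(C.sum(axis=1)) - C

def natural_wasserstein_step(theta, target, W, lr=0.05):
    """One step of theta <- theta - lr * G(theta)^+ grad_theta F, where
    G(theta) = J^T L(p_theta)^+ J is the pulled-back Wasserstein metric."""
    p = softmax(theta)
    J = softmax_jacobian(theta)
    grad_p = np.log(p / target) + 1.0            # d KL(p || target) / dp
    grad_theta = J.T @ grad_p                    # Euclidean parameter gradient
    G = J.T @ np.linalg.pinv(graph_laplacian(p, W)) @ J
    return theta - lr * np.linalg.pinv(G) @ grad_theta

# Usage: fit a 4-state model to a target distribution on a complete graph.
n = 4
W = np.ones((n, n)) - np.eye(n)                  # unit ground weights (assumed)
target = np.array([0.1, 0.2, 0.3, 0.4])
theta = np.zeros(n)
for _ in range(2000):
    theta = natural_wasserstein_step(theta, target, W)
```

Pseudoinverses are used because both the softmax Jacobian and the pulled-back metric are rank-deficient along the constant direction of the simplex.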

Cited by 54 publications (42 citation statements) · References 39 publications
“…Solutions to the Fokker-Planck equation are gradient flows of the relative entropy in the density manifold (Otto, 2001;Jordan, Kinderlehrer and Otto, 1998). Designing time-stepping methods which preserve gradient structure is also of current interest: see (Pathiraja and Reich, 2019) and, within the context of Wasserstein gradient flows, (Li and Montufar, 2018;Tong Lin et al, 2018;Li, Lin and Montúfar, 2019). The subject of discrete gradients for time-integration of gradient and Hamiltonian systems is developed in (Humphries and Stuart, 1994;Gonzalez, 1996;McLachlan, Quispel and Robidoux, 1999;Hairer and Lubich, 2013).…”
Section: Literature Review
confidence: 99%
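The gradient-flow structure quoted above can be illustrated with a small finite-volume scheme; the grid, the quadratic potential, and the arithmetic-mean edge mobility below are assumptions for the sketch, not details from any of the cited papers.

```python
import numpy as np

# 1-D Fokker-Planck equation d(rho)/dt = d/dx( rho * d/dx(log rho + V) ),
# discretized as the Wasserstein gradient flow of the relative entropy
# F(rho) = sum_i rho_i (log rho_i + V_i) dx, with no-flux boundaries.
n, dx, dt = 50, 0.1, 1e-3
x = (np.arange(n) + 0.5) * dx
V = 0.5 * (x - 2.5) ** 2                       # assumed quadratic potential
rho = np.ones(n) / (n * dx)                    # uniform initial density

def free_energy(rho):
    return float(np.sum(rho * (np.log(rho) + V)) * dx)

for _ in range(3000):
    phi = np.log(rho) + V                      # variational derivative dF/drho
    mob = 0.5 * (rho[1:] + rho[:-1])           # edge mobility (arithmetic mean)
    flux = mob * np.diff(phi) / dx             # rho * dphi/dx on interior faces
    drho = np.zeros(n)
    drho[:-1] += flux / dx                     # discrete divergence; boundary
    drho[1:] -= flux / dx                      # faces carry zero flux
    rho = rho + dt * drho
```

The scheme conserves mass exactly (the face fluxes telescope) and, for a stable step size, decreases the free energy monotonically, mirroring the continuous gradient-flow structure.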
“…Then G is a Riemannian metric on TΘ iff for each θ ∈ Θ and any ξ ∈ T_θΘ with ξ ≠ 0, we can find x ∈ M such that ∇ · (ρ_θ ∂_θ T_θ(T_θ^{-1}(x))ξ) ≠ 0. From now on, following [9,10], we call (Θ, G) the Wasserstein statistical manifold.…”
Section: Parameter Space Equipped With Wasserstein Metric
confidence: 99%
“…It treats the equation as the gradient flow of the relative entropy on the probability manifold equipped with the Wasserstein metric [5,14]. Recently, these studies have been extended to information geometry [1,2,3], creating a new area known as Wasserstein information geometry [7,9,10]. Inspired by those studies, in this paper we derive the metric tensor on the parameter space by pulling back the Wasserstein metric via the parameterized pushforward map.…”
Section: Introduction
confidence: 99%
“…The Wasserstein metric tensor of a statistical manifold (a parametrized set of probability densities) has been defined in [16]. A statistical manifold endowed with a Wasserstein metric tensor structure is called a Wasserstein statistical manifold.…”
Section: Introduction
confidence: 99%
“…We define the Ricci curvature lower bound via geodesic convexity of the KL divergence on a Wasserstein statistical manifold. We obtain a definition of the Ricci curvature that connects Wasserstein geometry [24] and information geometry [1,2], much in the spirit of [15,16], and take a natural further step towards connecting the two fields, in particular relating notions from learning applications and the geometry of statistical models. We focus on discrete sample spaces, which allows us to present a clear picture of the relations deriving from this theory, and leave the details of continuous settings for future work.…”
Section: Introduction
confidence: 99%