2022
DOI: 10.1088/2632-2153/ac9455

Gradients should stay on path: better estimators of the reverse- and forward KL divergence for normalizing flows

Abstract: We propose an algorithm to estimate the path-gradient of both the reverse and forward Kullback–Leibler divergence for an arbitrary manifestly invertible normalizing flow. The resulting path-gradient estimators are straightforward to implement, have lower variance, and lead not only to faster convergence of training but also to better overall approximation results compared to standard total gradient estimators. We also demonstrate that path-gradient training is less susceptible to mode-collapse. In light of ou…
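For orientation, the sketch below shows one common way to realize a path-gradient estimator of the reverse KL divergence, namely the stop-gradient ("sticking the landing") formulation: the model density is evaluated with a detached copy of the flow parameters, so backpropagation retains only the path term through the sample and drops the zero-mean score term. The toy flow, target density, and hyperparameters are placeholders rather than the paper's setup, and the paper derives its own, more general estimator for arbitrary invertible flows; this is only an illustrative sketch.

```python
# Minimal sketch of a path-gradient reverse-KL estimator via the stop-gradient
# trick; the flow, target, and settings below are placeholders, not the paper's code.
import copy
import math
import torch

class AffineFlow(torch.nn.Module):
    """Toy invertible flow: z = mu + exp(log_sigma) * eps, with standard-normal base."""
    def __init__(self, dim):
        super().__init__()
        self.mu = torch.nn.Parameter(torch.zeros(dim))
        self.log_sigma = torch.nn.Parameter(torch.zeros(dim))

    def forward(self, eps):
        return self.mu + torch.exp(self.log_sigma) * eps

    def log_prob(self, z):
        eps = (z - self.mu) * torch.exp(-self.log_sigma)
        log_base = -0.5 * (eps ** 2).sum(-1) - 0.5 * z.shape[-1] * math.log(2 * math.pi)
        return log_base - self.log_sigma.sum()

def log_p(z):
    # Unnormalized target log-density (placeholder); any differentiable function works.
    return -0.5 * ((z - 1.0) ** 2).sum(-1)

def reverse_kl_path_gradient_loss(flow, batch_size, dim):
    eps = torch.randn(batch_size, dim)
    z = flow(eps)                                  # sample depends on theta (the "path")
    frozen = copy.deepcopy(flow)                   # detached parameters for the density
    for p in frozen.parameters():
        p.requires_grad_(False)
    # Only d(log q - log p)/dz * dz/dtheta survives in backprop: the path gradient.
    return (frozen.log_prob(z) - log_p(z)).mean()

flow = AffineFlow(dim=2)
opt = torch.optim.Adam(flow.parameters(), lr=1e-2)
for step in range(200):
    opt.zero_grad()
    reverse_kl_path_gradient_loss(flow, batch_size=256, dim=2).backward()
    opt.step()
```

A useful property of this estimator is that its variance vanishes when q equals p, which is consistent with the faster convergence and improved approximation quality mentioned in the abstract.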

Cited by 12 publications (10 citation statements)
References 36 publications
“…Continuous Normalizing Flows represent a particular implementation of such architectures: they can be constructed setting f to be the solution of a Neural Ordinary Differential Equation (NODE) [41,42,50,51]…”
Section: Continuous Normalizing Flows (mentioning)
confidence: 99%
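For context on the construction mentioned in the quote, the following self-contained sketch builds a Continuous Normalizing Flow whose map f solves a neural ODE, accumulating the log-density change via the instantaneous change-of-variables formula; the fixed-step Euler integrator and exact trace computation are simplifications chosen for brevity, not details of the cited works.

```python
# Minimal self-contained sketch of a Continuous Normalizing Flow (not code from the
# cited works): f is the solution of a neural ODE dz/dt = v_theta(z, t), and the
# log-density changes by d(log q)/dt = -tr(dv/dz).
import torch

class VectorField(torch.nn.Module):
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(dim + 1, hidden), torch.nn.Tanh(),
            torch.nn.Linear(hidden, dim),
        )

    def forward(self, z, t):
        return self.net(torch.cat([z, t.expand(z.shape[0], 1)], dim=-1))

def cnf_forward(vf, z0, n_steps=50):
    """Integrate samples and the log-density change from t = 0 to t = 1."""
    z = z0.clone().requires_grad_(True)
    delta_logq = torch.zeros(z0.shape[0])
    dt = 1.0 / n_steps
    for k in range(n_steps):
        t = torch.full((1, 1), k * dt)
        v = vf(z, t)
        # Exact trace of the Jacobian dv/dz, one output dimension at a time.
        trace = torch.zeros(z.shape[0])
        for i in range(z.shape[1]):
            trace = trace + torch.autograd.grad(
                v[:, i].sum(), z, create_graph=True)[0][:, i]
        z = z + dt * v                        # Euler step for the state
        delta_logq = delta_logq - dt * trace  # Euler step for the log-density
    return z, delta_logq

vf = VectorField(dim=2)
eps = torch.randn(128, 2)                     # base samples
z1, dlogq = cnf_forward(vf, eps)              # log q(z1) = log N(eps | 0, 1) + dlogq
```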
“…One possible application of such models is to sample configurations from Boltzmann distributions: this approach found successful application in toy models of lattice field theory. While standard NFs still suffer from poor scaling [49], a generalization of these algorithms called Continuous Normalizing Flows (CNFs) [50] has been used to obtain interesting results in lattice scalar field theory [41,42,51].…”
Section: Introduction (mentioning)
confidence: 99%
“…We use reverse KL self-training [38,52,53] with path gradients [34,54] to optimize the models. Note that the base distributions here are pure gauge theories at 𝛽 > 0, rather than Haar uniform (𝛽 = 0).…”
Section: PoS(LATTICE2023)011 (mentioning)
confidence: 99%
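To make the quoted setup concrete, here is a hedged sketch of a reverse-KL self-training loss in which the base distribution is itself an unnormalized Boltzmann distribution exp(-S_base) at some β_base > 0 instead of a Haar-uniform prior; the flow interface (returning the mapped configuration and its log-Jacobian) and the action functions are assumptions made for illustration. The unknown base normalization log Z_base is independent of the flow parameters and therefore drops out of the gradient.

```python
# Sketch (illustrative interface, not the proceedings' code) of reverse-KL
# self-training with a nontrivial base: V ~ exp(-S_base)/Z_base, U = flow(V).
import torch

def reverse_kl_loss(flow, sample_base, S_base, S_target, batch_size):
    V = sample_base(batch_size)          # configurations drawn from the base theory
    U, log_det = flow(V)                 # invertible map and log|det dU/dV|
    log_q = -S_base(V) - log_det         # log q_theta(U) up to the constant log Z_base
    log_p = -S_target(U)                 # unnormalized target Boltzmann weight
    return (log_q - log_p).mean()        # KL(q || p) up to theta-independent constants
```

For the path-gradient variant referenced in the quote, the model log-density would additionally be evaluated with detached flow parameters, as in the reverse-KL sketch given after the abstract.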
“…[1] used normalizing flows to sample configurations of scalar field theories and thereby paved the way for the development of Boltzmann generators in lattice field theory. In the following years, numerous follow-up studies have been conducted with applications in scalar [1][2][3][4][5][6][7][8][9][10][11][12][13][14][15][16], pure gauge [17][18][19][20][21][22] and fermionic [23,24] field theories and beyond [25]. Notably, the number of works in the field started to substantially increase upon the release of a jupyter notebook [26], where the code from three important papers [1,17,35] was partly released.…”
Section: Theoretical Background (mentioning)
confidence: 99%