2021
DOI: 10.48550/arxiv.2106.08929
Preprint

KALE Flow: A Relaxed KL Gradient Flow for Probabilities with Disjoint Support

Abstract: We study the gradient flow for a relaxed approximation to the Kullback-Leibler (KL) divergence between a moving source and a fixed target distribution. This approximation, termed the KALE (KL approximate lower-bound estimator), solves a regularized version of the Fenchel dual problem defining the KL over a restricted class of functions. When using a Reproducing Kernel Hilbert Space (RKHS) to define the function class, we show that the KALE continuously interpolates between the KL and the Maximum Mean Discrepancy […]
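
The quantity described in the abstract can be made concrete with a small numerical sketch: a witness function h living in an RKHS is fit by maximising the Fenchel dual of the KL, E_P[h] - E_Q[exp(h) - 1], penalised by the squared RKHS norm of h. The Gaussian kernel, the hyperparameters, the plain gradient ascent, and the omission of the paper's overall scaling constant are all assumptions made here for illustration; this is a sketch of the general construction, not the authors' implementation.

import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # Gaussian (RBF) kernel matrix between the rows of A and the rows of B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * sigma**2))

def kale_sketch(X, Y, lam=0.1, sigma=1.0, lr=0.02, steps=2000):
    # Toy estimator of the regularised KL dual between samples X ~ P and Y ~ Q.
    # The witness is h(.) = sum_k alpha_k k(z_k, .), with basis z = pooled samples,
    # and ||h||_H^2 = alpha' K_zz alpha.  We maximise, by gradient ascent on alpha,
    #     mean(h(X)) - mean(exp(h(Y)) - 1) - (lam / 2) * ||h||_H^2 .
    # (Hyperparameters and the kernel choice are illustrative assumptions.)
    Z = np.vstack([X, Y])
    K_zz = gaussian_kernel(Z, Z, sigma)
    K_xz = gaussian_kernel(X, Z, sigma)
    K_yz = gaussian_kernel(Y, Z, sigma)
    n, m = len(X), len(Y)
    alpha = np.zeros(len(Z))
    for _ in range(steps):
        h_y = K_yz @ alpha
        grad = K_xz.sum(axis=0) / n - K_yz.T @ np.exp(h_y) / m - lam * (K_zz @ alpha)
        alpha += lr * grad
    h_x, h_y = K_xz @ alpha, K_yz @ alpha
    return h_x.mean() - (np.exp(h_y) - 1.0).mean() - 0.5 * lam * alpha @ K_zz @ alpha

# Two well-separated Gaussian samples: the empirical supports are essentially
# disjoint, yet the regularised dual objective stays finite, which is the point
# of the relaxation.
rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(100, 1))
Y = rng.normal(4.0, 1.0, size=(100, 1))
print(kale_sketch(X, Y))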

Cited by 2 publications (2 citation statements) | References 22 publications
“…See, for example, Dupuis and Mao (2019); Birrell et al (2020a), who generalize φ-divergences to distributions that do not have common support. This idea is applied to GANs in Song and Ermon (2020); Glaser et al (2021). Domain adaptation and causal inference.…”
Section: Related Work (mentioning; confidence: 99%)
“…In such cases, new divergences could be considered e.g. Wasserstein metrics already studied in the DRO literature [53,12] or their Integral Probability Metrics (IPM) generalization [55]; alternatively we can consider various interpolations of divergences and IPMs studied recently in the machine learning literature such as [30,23,6,32] and references therein. For instance, the recently introduced (f, Γ)-divergences [6] are interpolations of f-divergences and IPMs that combine advantageous features of both, such as the capability to handle heavy-tailed data (property inherited from f-divergences) and to compare non-absolutely continuous distributions (inherited from IPMs).…”
Section: Correctability of the Orr Bayesian Network (mentioning; confidence: 99%)
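
For context, the (f, Γ)-divergences referred to in this statement are defined variationally over a restricted function class Γ. A common way to write this interpolation (following the notation I associate with reference [6]; the exact conventions, such as argument order and the treatment of the shift ν, may differ in that reference) is

\[
  D_f^{\Gamma}(P \,\|\, Q)
  \;=\; \sup_{g \in \Gamma}
        \Big\{ \mathbb{E}_{P}[g]
               \;-\; \inf_{\nu \in \mathbb{R}}
               \big( \nu + \mathbb{E}_{Q}\!\big[ f^{*}(g - \nu) \big] \big)
        \Big\},
\]

where f^{*} is the convex conjugate of f. Taking Γ to be all measurable functions recovers the usual f-divergence, while a suitable scaling limit in f recovers the IPM generated by Γ; it is the restriction to Γ (for instance an RKHS ball) that keeps the divergence finite for mutually singular distributions, which is the property the quoted statement emphasises.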