2021
DOI: 10.48550/arxiv.2109.00528
Preprint

Wasserstein GANs with Gradient Penalty Compute Congested Transport

Abstract: Wasserstein GANs with Gradient Penalty (WGAN-GP) are an extremely popular method for training generative models to produce high quality synthetic data. While WGAN-GP were initially developed to calculate the Wasserstein 1 distance between generated and real data, recent works (e.g. Stanczuk et al. (2021)) have provided empirical evidence that this does not occur, and have argued that WGAN-GP perform well not in spite of this issue, but because of it. In this paper we show for the first time that WGAN-GP comput…

Cited by 1 publication
(2 citation statements)
References 11 publications
“…In practice we found that this value of λ stabilizes the estimates of $W_1(\mu_n, \nu)$ and the training of TTC; using smaller values of λ leads to inflated estimates of $W_1(\mu_n, \nu)$, leading to overly large step sizes and unstable training. This is confirmed by the recent analysis in [30], which shows that at best the value in (15), in expectation, converges to $W_1(\mu_n, \nu)$ like $O(\lambda^{-1})$.…”
Section: The Algorithm (supporting)
confidence: 74%
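
As a toy illustration of the inflation described in the statement above, the sketch below uses a simplified setting that is an assumption made here, not the estimator from the citing paper (its equation (15) is not reproduced on this page). Take two point masses on the real line a distance d apart, so the true Wasserstein 1 distance is d, and restrict the critic to linear functions f(x) = a·x with a > 0. The gradient-penalized objective then reduces to a·d − λ(a − 1)^2, which can be maximized in closed form. The function name `penalized_estimate` is hypothetical.

```python
def penalized_estimate(d: float, lam: float) -> tuple[float, float]:
    """Maximize a*d - lam*(a - 1)**2 over the critic slope a (toy model).

    d   : distance between the two point masses (the true W1 in this toy case)
    lam : gradient penalty weight (lambda), assumed positive
    Returns the optimal slope and the resulting penalized estimate.
    """
    # Setting the derivative d - 2*lam*(a - 1) to zero gives the optimal slope.
    a_star = 1.0 + d / (2.0 * lam)
    # Objective value at the optimum; algebraically equal to d + d**2 / (4*lam).
    value = a_star * d - lam * (a_star - 1.0) ** 2
    return a_star, value


true_w1 = 1.0
for lam in (0.1, 1.0, 10.0, 1000.0):
    slope, estimate = penalized_estimate(true_w1, lam)
    excess = estimate - true_w1
    print(f"lambda={lam:g}  slope={slope:.4f}  estimate={estimate:.4f}  excess={excess:.6f}")
```

In this toy model the estimate exceeds the true distance by d^2/(4λ), so the excess shrinks in proportion to 1/λ and the optimal slope approaches 1 as λ grows. This matches the qualitative picture in the quoted statement, though it says nothing about the rates proved in [30].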
“…To start with, our theoretical analysis relies on being able to compute a Kantorovich potential $u_0$ for the pair $(\mu, \nu)$. Given recent results in [30] showing that the optimization problem for learning critics from [17] actually returns a function that solves a congested transport problem, which is distinct from a Kantorovich potential, one may ask to what extent our theoretical analysis applies to the types of critics learned in practice. The distinction between these problems decreases as λ gets large, so we suspect that our assumption of being able to compute a Kantorovich potential is reasonable for the λ value we used (λ = 1000), but this is certainly something to consider, especially for the smaller values of λ typically used in the literature.…”
Section: Discussion Of Limitations and Societal Impact (mentioning)
confidence: 99%