Despite the recent popularity of neural network-based solvers for optimal transport (OT), there is no standard quantitative way to evaluate their performance. In this paper, we address this issue for quadratic-cost transport, specifically computation of the Wasserstein-2 distance, a commonly-used formulation of optimal transport in machine learning. To overcome the challenge of computing ground-truth transport maps between continuous measures, which is needed to assess these solvers, we use input-convex neural networks (ICNNs) to construct pairs of measures whose ground-truth OT maps can be obtained analytically. This strategy yields pairs of continuous benchmark measures in high-dimensional spaces, such as spaces of images. We thoroughly evaluate existing optimal transport solvers using these benchmark measures. Even though these solvers perform well in downstream tasks, many do not faithfully recover optimal transport maps. To investigate the cause of this discrepancy, we further test the solvers in an image-generation setting. Our study reveals crucial limitations of existing solvers and shows that increased OT accuracy does not necessarily correlate with better downstream results.

Solving optimal transport (OT) with continuous methods has become widespread in machine learning, including methods for large-scale OT [11,36] and the popular Wasserstein Generative Adversarial Network (W-GAN) [3,12]. Rather than discretizing the problem [31], continuous OT algorithms use neural networks or kernel expansions to estimate transport maps or dual solutions. This helps scale OT to large-scale, higher-dimensional problems not handled by discrete methods. Notable successes of continuous OT include generative modeling [42,20,19,7] and domain adaptation [43,37,25].

In these applications, OT is typically incorporated as one of the loss terms for a neural network model.
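The benchmark construction rests on Brenier's theorem: for the quadratic cost, the gradient of a convex potential is the optimal transport map from a source measure to its pushforward under that gradient. The following sketch illustrates the idea with a hand-picked convex quadratic potential in place of an actual ICNN, and an illustrative (unnormalized) squared-error metric for scoring a candidate map; the function names and the specific potential are assumptions for illustration, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4

# Convex potential psi(x) = 0.5 * x^T A x + b^T x with A symmetric
# positive definite, so grad psi(x) = A x + b is convex-gradient and,
# by Brenier's theorem, the exact OT map from mu to its pushforward.
M = rng.standard_normal((d, d))
A = M @ M.T + np.eye(d)  # symmetric positive definite
b = rng.standard_normal(d)

def ot_map(x):
    """Ground-truth OT map: the gradient of the convex potential."""
    return x @ A.T + b

# Source measure mu: standard Gaussian samples.
# Target measure nu: the pushforward of mu through the known map.
x = rng.standard_normal((10_000, d))
y = ot_map(x)

def l2_error(T_hat, x):
    """Mean squared deviation of a candidate map from the known map."""
    diff = T_hat(x) - ot_map(x)
    return np.mean(np.sum(diff ** 2, axis=1))

identity = lambda z: z
print(l2_error(ot_map, x))    # the true map has zero error
print(l2_error(identity, x))  # a naive map has strictly positive error
```

Because the ground-truth map is available in closed form, any continuous OT solver trained on samples from (mu, nu) can be scored directly against it, which is exactly the kind of quantitative evaluation the benchmark enables.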
For example, in W-GANs, the OT cost serves as a loss function for the generator; the model incorporates a neural network-based OT solver to estimate this loss. Although recent W-GANs achieve state-of-the-art generative performance, it remains unclear to what extent this success is connected to OT. For example, [28,32,38] show that popular solvers for the Wasserstein-1 (W1) distance in GANs fail to estimate W1 accurately. While W-GANs were initially introduced