2020
DOI: 10.48550/arxiv.2006.09033
Preprint

Two steps at a time -- taking GAN training in stride with Tseng's method

Abstract: Motivated by the training of Generative Adversarial Networks (GANs), we study methods for solving minimax problems with additional nonsmooth regularizers. We do so by employing monotone operator theory, in particular the Forward-Backward-Forward (FBF) method, which avoids the known issue of limit cycling by correcting each update by a second gradient evaluation. Furthermore, we propose a seemingly new scheme which recycles old gradients to mitigate the additional computational cost. In doing so we rediscover a…
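To make the update structure described in the abstract concrete, below is a minimal numerical sketch of one Forward-Backward-Forward (Tseng) step, assuming a generic monotone operator F and an l1 regularizer handled through its proximal map; the bilinear toy problem, step size, and function names are illustrative assumptions, not the paper's setup.

```python
import numpy as np

# A minimal sketch of one Forward-Backward-Forward (Tseng) iteration for a
# regularized monotone inclusion 0 in F(z) + dg(z). The operator F below is
# the gradient field of a toy bilinear saddle problem and g is an l1 penalty;
# both are illustrative assumptions, not the paper's experimental setup.

A = np.array([[1.0, 2.0],
              [0.0, 1.0]])

def F(z):
    # Monotone operator of min_x max_y x^T A y, with z = (x, y) stacked.
    x, y = z[:2], z[2:]
    return np.concatenate([A @ y, -A.T @ x])

def prox_l1(v, t):
    # Proximal map of t * ||.||_1 (soft-thresholding).
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def fbf_step(z, lam=0.1):
    # Forward step followed by a backward (prox) step ...
    w = prox_l1(z - lam * F(z), lam)
    # ... then the correcting second gradient evaluation.
    return w - lam * (F(w) - F(z))

z = np.ones(4)
for _ in range(300):
    z = fbf_step(z)
print(np.round(z, 4))  # iterates approach a solution (the origin for this toy problem)
```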

Cited by 4 publications (6 citation statements)
References 19 publications
“…But when not, FBF applied to the VI requires one proximal operator every iteration, whereas extragradient requires two. This advantage can be important for the cases where proximal operator is computationally expensive [Böh+20]. We have essentially the same convergence result as for extragradient in Section 2.3.…”
Section: Forward-Backward-Forward with Variance Reduction (mentioning)
confidence: 63%
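The per-iteration cost difference this statement refers to can be seen by writing the two updates side by side; the following is a standard textbook form (notation assumed, not quoted from the citing paper), with F the operator, g the regularizer, and λ the step size:

```latex
% Extragradient: two proximal evaluations per iteration.
\begin{aligned}
y_k &= \operatorname{prox}_{\lambda g}\bigl(x_k - \lambda F(x_k)\bigr), &
x_{k+1} &= \operatorname{prox}_{\lambda g}\bigl(x_k - \lambda F(y_k)\bigr).
\end{aligned}
% FBF: one proximal evaluation; the correction step needs no prox.
\begin{aligned}
y_k &= \operatorname{prox}_{\lambda g}\bigl(x_k - \lambda F(x_k)\bigr), &
x_{k+1} &= y_k - \lambda\bigl(F(y_k) - F(x_k)\bigr).
\end{aligned}
```

Both schemes use two evaluations of F per iteration; the saving is only in the proximal step, which is why the advantage matters when that step is expensive.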
“…It is well known that while OGDA requires only half as many gradient evaluations per iterations compared to EG, it also asks for a smaller stepsize, see [6,46]. In the monotone (convex-concave) setting typical bounds on the stepsize are 1/L and 1/(2L) for EG and OGDA, respectively, see [6,46]. The downside of a smaller stepsize is typically a worse constant in the convergence rate.…”
Section: OGDA for Problems with Weak Minty Solutions (mentioning)
confidence: 99%
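For context, the two methods contrasted here have the following standard unconstrained forms; the notation is an assumption for illustration, and the step-size bounds are the ones quoted in the statement:

```latex
% Extragradient (EG): two gradient evaluations per iteration, typically \lambda \le 1/L.
y_k = z_k - \lambda F(z_k), \qquad z_{k+1} = z_k - \lambda F(y_k).
% Optimistic GDA (OGDA): one fresh gradient evaluation per iteration, typically \lambda \le 1/(2L).
z_{k+1} = z_k - \lambda \bigl( 2 F(z_k) - F(z_{k-1}) \bigr).
```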
“…Note that OGDA is most commonly written in the form where β = 1 is used, see [6,17,43], with the exception of two recent works which have investigated a more general coefficient see [45,55]. However, there, and for related methods [11,28,31], the parameter β is usually restricted to be less than 1.…”
Section: Introduction (mentioning)
confidence: 99%
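One common way to write the β-parameterized update being discussed, in the same notation as above (an illustrative form; the citing paper's exact convention may differ):

```latex
% OGDA with a general extrapolation coefficient \beta; \beta = 1 recovers the
% standard optimistic update shown earlier.
z_{k+1} = z_k - \lambda \bigl( (1 + \beta) F(z_k) - \beta F(z_{k-1}) \bigr).
```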
“…The papers [68] and [54] consider stochastic variants of three-operator splitting, but they can only be applied to optimization problems. The methods of [70] and [7] can be applied to simple saddle-point problems involving a single regularizer.…”
Section: Related Work (mentioning)
confidence: 99%