2021
DOI: 10.48550/arxiv.2110.04814
Preprint
Finding Second-Order Stationary Points in Nonconvex-Strongly-Concave Minimax Optimization

Abstract: We study the smooth minimax optimization problem min_x max_y f(x, y), where the objective function is strongly concave in y but possibly nonconvex in x. This problem covers many applications in machine learning, such as regularized GANs, reinforcement learning, and adversarial training. Most existing theory on gradient descent ascent (GDA) focuses on establishing convergence to a first-order stationary point of f(x, y) or of the primal function P(x) := max_y f(x, y). In this pa…
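For intuition, here is a minimal sketch (not the paper's algorithm) of two-timescale gradient descent ascent on a hypothetical toy objective f(x, y) = sin(x) + x*y - y^2, which is 2-strongly concave in y and nonconvex in x; the step sizes and function names below are assumptions for illustration only.

```python
import numpy as np

# Toy nonconvex-strongly-concave objective (hypothetical example):
#   f(x, y) = sin(x) + x*y - y**2
# f is 2-strongly concave in y and nonconvex in x; the primal function
# is P(x) = max_y f(x, y) = sin(x) + x**2 / 4.

def grad_x(x, y):
    return np.cos(x) + y      # df/dx

def grad_y(x, y):
    return x - 2.0 * y        # df/dy

def gda(x0, y0, eta_x=0.01, eta_y=0.1, iters=5000):
    """Two-timescale GDA: gradient descent on x, gradient ascent on y.

    eta_y > eta_x reflects the usual timescale separation for
    nonconvex-strongly-concave problems (step sizes are assumptions).
    """
    x, y = x0, y0
    for _ in range(iters):
        x -= eta_x * grad_x(x, y)
        y += eta_y * grad_y(x, y)
    return x, y

x, y = gda(x0=2.0, y0=0.0)
print(f"x ≈ {x:.4f}, y ≈ {y:.4f}, grad_x ≈ {grad_x(x, y):.2e}")
```

Because the y-step is larger, the inner variable closely tracks y*(x) = x/2, so the x-updates approximately descend the primal function P(x) = sin(x) + x^2/4 toward one of its first-order stationary points.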

Cited by 1 publication (3 citation statements) | References 18 publications
“…However, all the previous works targeted finding a stationary point of Φ(x). Very recently, [9,50] proposed cubic regularized GDA, a second-order algorithm that provably converges to a local minimum. [21] provided asymptotic results showing that GDA converges to a local minimax point almost surely.…”
Section: Related Work
confidence: 99%
“…Remark 2.10 Compared with results on second-order methods escaping saddle points for the minimax problem [9,50], our perturbed GDmax algorithm is purely first-order, which means we do not need to compute Hessian-vector products. Moreover, the algorithms in [9,50] require solving a nonconvex cubic sub-problem and multiple linear systems in each iteration. All these expensive computations are avoided in our perturbed GDmax algorithm, which makes it practical in real applications.…”
confidence: 99%
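As a rough illustration of the contrast drawn in this remark, here is a purely first-order perturbed GDmax sketch on the same toy objective; the loop structure, constants, and helper names are assumptions for illustration, not the cited algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

# Same toy objective: f(x, y) = sin(x) + x*y - y**2 (2-strongly concave in y).
grad_x = lambda x, y: np.cos(x) + y
grad_y = lambda x, y: x - 2.0 * y

def argmax_y(x, y, eta=0.25, inner=50):
    """Inner loop: first-order gradient ascent approximating max_y f(x, y)."""
    for _ in range(inner):
        y += eta * grad_y(x, y)
    return y

def perturbed_gdmax(x, y, eta=0.05, tol=1e-4, radius=0.05, rounds=10, T=2000):
    """Schematic perturbed GDmax (all constants assumed): run plain GDmax
    until the primal gradient is small, kick x randomly to escape strict
    saddle points of P, and repeat; no Hessians or linear systems needed."""
    for _ in range(rounds):
        for _ in range(T):                  # plain first-order GDmax phase
            y = argmax_y(x, y)
            g = grad_x(x, y)                # ≈ P'(x) by Danskin's theorem
            if abs(g) < tol:
                break
            x -= eta * g
        x += radius * rng.uniform(-1, 1)    # perturbation step
    for _ in range(T):                      # final phase, no perturbation
        y = argmax_y(x, y)
        g = grad_x(x, y)
        if abs(g) < tol:
            break
        x -= eta * g
    return x, y

x, y = perturbed_gdmax(x=2.0, y=0.0)
print(f"x ≈ {x:.4f}, P'(x) ≈ {np.cos(x) + x / 2:.2e}")
```

If the iterate keeps returning to the same small-gradient point after repeated kicks, it is behaving like an approximate local minimum of P rather than a strict saddle, which is the intuition behind perturbation-based first-order escape schemes.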