2022
DOI: 10.48550/arxiv.2203.04850
Preprint

Federated Minimax Optimization: Improved Convergence Analyses and Algorithms

Abstract: In this paper, we consider nonconvex minimax optimization, which is gaining prominence in many modern machine learning applications such as GANs. Large-scale edge-based collection of training data in these applications calls for communication-efficient distributed optimization algorithms, such as those used in federated learning, to process the data. In this paper, we analyze Local stochastic gradient descent ascent (SGDA), the local-update version of the SGDA algorithm. SGDA is the core algorithm used in mini…

Cited by 4 publications (6 citation statements)
References 15 publications
“…Otherwise, at the cost of increased gradient complexity, each device can query the oracle $O(1/\epsilon^2)$ times every round and average the results, making the stochastic-gradient variance $O(\epsilon^2)$. This procedure makes the bound vanish and yields a gradient complexity matching the one given in [43] for the federated learning scenario. Fig.…”
Section: B. Non-convex Loss Function
confidence: 98%
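The variance-reduction trick described in the snippet above — querying the stochastic oracle many times per round and averaging — can be illustrated with a minimal sketch. The quadratic objective, noise level, and sample counts below are illustrative assumptions, not the cited paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_grad(x, sigma=1.0):
    """Stochastic gradient oracle for the toy objective f(x) = 0.5 * x**2:
    the true gradient x plus zero-mean noise of variance sigma**2."""
    return x + sigma * rng.standard_normal()

def averaged_grad(x, num_queries, sigma=1.0):
    """Query the oracle num_queries times and average; the variance of the
    resulting estimate drops from sigma**2 to sigma**2 / num_queries."""
    return np.mean([noisy_grad(x, sigma) for _ in range(num_queries)])

# Empirical check: the variance of the averaged estimator is about 1/K.
K = 100
samples = [averaged_grad(1.0, K) for _ in range(2000)]
print(np.var(samples))  # expected to be close to 1 / K = 0.01
```

Averaging $K = O(1/\epsilon^2)$ queries is exactly what shrinks the per-round variance to $O(\epsilon^2)$ at a $K$-fold cost in gradient complexity.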
“…The authors also prove sub-linear convergence for Local SGDA under diminishing stepsizes. Their convergence guarantee was later improved by [26] to match the results of centralized SGDA [15]. However, we note that all of these algorithms require diminishing learning rates to obtain exact solutions, and therefore suffer from relatively slow convergence, whereas our algorithm allows constant stepsizes and hence achieves linear convergence.…”
Section: Related Work
confidence: 99%
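Local SGDA, the algorithm these snippets discuss, alternates a few local descent/ascent steps on each client with server-side averaging of the iterates. A minimal sketch, assuming a toy quadratic minimax objective with a saddle point at the origin (the client objectives, noise level, and hyperparameters are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy federated minimax problem (an illustrative assumption): client i holds
# f_i(x, y) = 0.5*x^2 + b_i*x*y - 0.5*y^2, so the global saddle point is (0, 0).
b = rng.uniform(-1.0, 1.0, size=5)   # one coupling coefficient per client
noise = 0.1                          # stochastic-gradient noise level

def local_sgda(x0, y0, rounds=200, local_steps=10, lr=0.05):
    """Local SGDA: each client runs `local_steps` descent/ascent steps on
    stochastic gradients of its own f_i, then the server averages iterates."""
    x, y = x0, y0
    for _ in range(rounds):
        xs, ys = [], []
        for bi in b:                 # each client starts from the average
            xi, yi = x, y
            for _ in range(local_steps):
                gx = xi + bi * yi + noise * rng.standard_normal()
                gy = bi * xi - yi + noise * rng.standard_normal()
                xi -= lr * gx        # descent step on x
                yi += lr * gy        # ascent step on y
            xs.append(xi)
            ys.append(yi)
        x, y = np.mean(xs), np.mean(ys)  # server averaging
    return x, y

x, y = local_sgda(2.0, -2.0)
print(abs(x), abs(y))  # both should end up small, near the saddle (0, 0)
```

With constant stepsizes and persistent gradient noise the iterates only reach a noise-dominated neighborhood of the saddle, which is the behavior the diminishing-stepsize requirement in the snippet above is meant to remove.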
“…By solving $\min_{x \in \mathbb{R}^d} \max_{\|y\| \le 1} \frac{1}{m} \sum_{i=1}^m f_i(x, y)$, we obtain a global robust model of the linear regression problem even under the worst contamination of gross noise. To measure the convergence of algorithms, we use the robust loss, i.e., given a model $x$, the corresponding robust loss [25,26] is defined by $f(x) = \max_{\|y\| \le 1} \sum_{i=1}^m f_i(x, y)$. We generate local models and data as follows: the local model $x_i^*$ is generated by a multivariate normal distribution.…”
Section: Robust Linear Regression
confidence: 99%
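Evaluating the robust loss above requires solving the inner maximization over the unit ball. The snippet does not give the form of $f_i$, so the sketch below assumes a hypothetical instance that is linear in $y$ — $f_i(x, y) = y^\top a_i (a_i^\top x - b_i)$ — which makes the inner maximum available in closed form as a check on the projected-ascent loop:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data (not the paper's experiment): with the linear-in-y choice
# above, sum_i f_i(x, y) = y @ r where r = A.T @ (A @ x - b_vec), so the
# constrained maximum over ||y|| <= 1 equals the norm of r.
m, d = 20, 5
A = rng.standard_normal((m, d))
b_vec = rng.standard_normal(m)

def robust_loss(x, steps=500, lr=0.1):
    """Evaluate f(x) = max_{||y|| <= 1} sum_i f_i(x, y) by projected
    gradient ascent on y over the unit Euclidean ball."""
    r = A.T @ (A @ x - b_vec)      # gradient in y of the (linear) inner sum
    y = np.zeros(d)
    for _ in range(steps):
        y = y + lr * r             # ascent step
        norm = np.linalg.norm(y)
        if norm > 1.0:             # project back onto the unit ball
            y = y / norm
    return float(y @ r)

x = rng.standard_normal(d)
print(robust_loss(x), np.linalg.norm(A.T @ (A @ x - b_vec)))  # should agree
```

For general nonconvex-in-$y$ objectives no such closed form exists, and the same projected-ascent inner loop is the standard way to approximate the robust loss.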
“…[2020], Lu et al. [2020], Yan et al. [2020], Guo et al. [2021], Sharma et al. [2022]. Among them, [Zhang et al., 2021b] achieved the optimal complexity $O(\sqrt{\kappa}\,\epsilon^{-2})$ in the deterministic case by introducing the Catalyst acceleration scheme [Lin et al., 2015, Paquette et al., 2018] into minimax problems, and Luo et al. [2020] and Zhang et al. [2021b] achieved the best complexities known so far in the finite-sum case, which are $O(\sqrt{n}\,\kappa^2 \epsilon^{-2})$ and $O(n^{3/4} \sqrt{\kappa}\,\epsilon^{-2})$, respectively.…”
Section: Literature Review
confidence: 99%