2019
DOI: 10.48550/arxiv.1904.09015
Preprint
Decentralized and Parallel Primal and Dual Accelerated Methods for Stochastic Convex Programming Problems

Abstract: We introduce primal and dual stochastic gradient oracle methods for decentralized convex optimization problems. Both for primal and dual oracles the proposed methods are optimal in terms of the number of communication steps. However, for all classes of the objective, the optimality in terms of the number of oracle calls per node in the class of methods with optimal number of communication steps takes place only up to a logarithmic factor and the notion of smoothness (the worst case vs the average one). We als…

Cited by 13 publications (34 citation statements). References 48 publications.
“…Accelerated Methods for Strongly Convex and Smooth Decentralized Optimization. The accelerated methods which can be applied to this scenario include the accelerated distributed Nesterov gradient descent (Acc-DNGD) (Qu & Li, 2020), the robust distributed accelerated stochastic gradient method (Fallah et al., 2019), the multi-step dual accelerated method (Scaman et al., 2017), the accelerated penalty method (APM) (Li et al., 2020a; Dvinskikh & Gasnikov, 2019), the multi-consensus decentralized accelerated gradient descent (Mudag) (Ye et al., 2020a), accelerated EXTRA (Li et al., 2020b), the decentralized accelerated augmented Lagrangian method (Arjevani et al., 2020), and the accelerated proximal alternating predictor-corrector method (APAPC) (Kovalev et al., 2020). Scaman et al. (2017) proved the $\Omega\big(\sqrt{\frac{L}{\mu(1-\sigma)}}\log\frac{1}{\varepsilon}\big)$ communication complexity lower bound (see the notations in Section 1.3) and the $\Omega\big(\sqrt{\frac{L}{\mu}}\log\frac{1}{\varepsilon}\big)$ gradient computation complexity lower bound.…”
Section: Literature Review (mentioning; confidence: 99%)
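The accelerated schemes listed above all build on the same primitive: each node mixes its iterate with its neighbors' through a gossip matrix W, then takes a local gradient step. Below is a minimal runnable sketch of that pattern using plain (non-accelerated) DIGing-style gradient tracking on toy quadratics; the function names, graph, and constants are illustrative assumptions, not taken from any of the cited papers.

```python
import numpy as np

def gradient_tracking(grads, W, x0, alpha, iters=500):
    """DIGing-style gradient tracking (illustrative baseline, not the
    cited accelerated methods themselves): the x-update mixes iterates
    and steps along the tracked average gradient s; the s-update mixes
    and adds the change in the local gradient."""
    x = np.array(x0, dtype=float)
    g = np.array([grads[i](x[i]) for i in range(len(grads))])
    s = g.copy()                      # s tracks the network-average gradient
    for _ in range(iters):
        x_new = W @ x - alpha * s     # one communication round + local step
        g_new = np.array([grads[i](x_new[i]) for i in range(len(grads))])
        s = W @ s + g_new - g         # second communication round
        x, g = x_new, g_new
    return x

# Each node i holds f_i(x) = 0.5 * (x - i)**2, so the minimizer of
# (1/n) * sum_i f_i is the mean of the targets 0..n-1, i.e. 1.5 for n = 4.
n = 4
grads = [lambda x, i=i: x - i for i in range(n)]
W = np.full((n, n), 1.0 / n)          # complete-graph averaging (doubly stochastic)
x = gradient_tracking(grads, W, x0=np.zeros(n), alpha=0.1)
```

Unlike plain decentralized gradient descent, which with a constant stepsize only reaches a neighborhood of the optimum, gradient tracking drives all nodes to the exact consensus minimizer; the accelerated methods cited above additionally add Nesterov-style momentum to reach the lower-bound rates.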
“…The accelerated methods for this scenario are much scarcer. Examples include the distributed Nesterov gradient with consensus (Jakovetić et al., 2014a), Acc-DNGD (Qu & Li, 2020), APM (Li et al., 2020a; Dvinskikh & Gasnikov, 2019), accelerated EXTRA (Li et al., 2020b), and the accelerated dual ascent (Uribe et al., 2021), where the last one adds a small regularizer to translate the problem to a strongly convex and smooth one. Scaman et al. (2019) proved the $\Omega\big(\sqrt{\frac{L}{\varepsilon(1-\sigma)}}\big)$ communication complexity lower bound and the $\Omega\big(\sqrt{\frac{L}{\varepsilon}}\big)$ gradient computation complexity lower bound.…”
Section: Literature Review (mentioning; confidence: 99%)
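The regularizer trick mentioned for the accelerated dual ascent can be seen on a toy least-squares problem: adding a small (mu/2)||x||^2 term makes a merely convex objective strongly convex while perturbing the minimizer only slightly. A hedged sketch; the matrix A, the vector b, and the choice mu = 1e-6 are illustrative assumptions.

```python
import numpy as np

def solve_regularized(A, b, mu):
    """Closed-form minimizer of 0.5*||A x - b||^2 + (mu/2)*||x||^2."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + mu * np.eye(n), A.T @ b)

A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
b = np.array([1.0, 2.0, 3.0])

x_exact = solve_regularized(A, b, mu=0.0)   # unregularized least squares
x_reg = solve_regularized(A, b, mu=1e-6)    # strongly convex surrogate
gap = np.linalg.norm(x_reg - x_exact)       # perturbation caused by mu
```

Choosing mu on the order of the target accuracy divided by the squared domain radius keeps `gap` below the solution tolerance, which is exactly how a strongly convex solver is recycled for the convex case, at the cost of the extra logarithmic factors in the resulting rates.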
“…In particular, Scaman et al. (2017) established lower bounds on the decentralized communication and local computation complexities for solving this problem, and proposed an optimal algorithm called MSDA for the case when access to the dual oracle (the gradient of the Fenchel transform of the objective function) is assumed. Under a primal oracle (the gradient of the objective function), the current state of the art includes the near-optimal algorithms APM-C (Li et al., 2018; Dvinskikh and Gasnikov, 2019) and Mudag (Ye et al., 2020), and a recently proposed optimal algorithm OPAPC (Kovalev et al., 2020).…”
Section: Contributions (mentioning; confidence: 99%)
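The primal/dual oracle distinction above is concrete: a dual oracle evaluates the gradient of the Fenchel conjugate, i.e. it solves argmax_x (⟨y, x⟩ - f(x)). For a strongly convex quadratic this argmax has a closed form, and the identity ∇f*(∇f(x)) = x gives a quick sanity check. The instance below is a made-up toy example, not taken from the paper.

```python
import numpy as np

# f(x) = 0.5 * x^T A x - b^T x with A symmetric positive definite.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -1.0])

def grad_f(x):
    """Primal oracle: gradient of the objective."""
    return A @ x - b

def grad_f_conj(y):
    """Dual oracle: gradient of the Fenchel conjugate,
    argmax_x (<y, x> - f(x)), here solvable as A^{-1} (y + b)."""
    return np.linalg.solve(A, y + b)

x = np.array([0.3, -0.7])
roundtrip = grad_f_conj(grad_f(x))   # should recover x exactly
```

In general the dual oracle is the more expensive one, since evaluating it amounts to solving a local optimization subproblem, which is why primal-oracle methods such as APM-C, Mudag, and OPAPC matter even though MSDA is optimal under the dual oracle.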
“…where $f_i(x)$ is a private smooth and convex function possessed by agent $i$, and $x$ is the global decision variable. Decentralized optimization has found wide applications in many modern scientific fields; many of these methods, more generally, use doubly stochastic mixing matrices [12,13,17,20,22,23,24,41,61]. Among these works, OPAPC [20] simultaneously achieves the lower bounds on the gradient computation complexity and the communication complexity over undirected graphs which were given in [45].…”
Section: Introduction (mentioning; confidence: 99%)
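A doubly stochastic mixing matrix of the kind these works rely on can be built from any connected undirected graph with the standard Metropolis-Hastings weights. The sketch below constructs one for a hypothetical 4-node path graph and checks the defining properties; the graph itself is an illustrative assumption.

```python
import numpy as np

def metropolis_weights(adj):
    """Build a symmetric, doubly stochastic mixing matrix from an
    undirected adjacency matrix using Metropolis-Hastings weights:
    W[i,j] = 1 / (1 + max(deg_i, deg_j)) for each edge, with the
    leftover mass placed on the self-loop W[i,i]."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if adj[i, j]:
                W[i, j] = 1.0 / (1 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()
    return W

# Path graph 0 - 1 - 2 - 3.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]])
W = metropolis_weights(adj)
row_sums = W.sum(axis=1)
col_sums = W.sum(axis=0)

# Second-largest eigenvalue magnitude: the sigma appearing in the
# 1 - sigma factors of the communication lower bounds.
sigma = np.sort(np.abs(np.linalg.eigvalsh(W)))[-2]
```

Because the construction only needs each pair of neighbors to know each other's degree, it is computable locally, which is what makes it the default choice in the cited decentralized methods.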
“…Among these works, OPAPC [20] simultaneously achieves the lower bounds on the gradient computation complexity and the communication complexity over undirected graphs which were given in [45]. Some works [12,17,22,23,61] rely on an inner loop of multiple consensus steps to guarantee accelerated convergence. The use of inner loops may limit the applications of these methods due to the communication bottleneck in decentralized computation [21,24,41].…”
Section: Introduction (mentioning; confidence: 99%)
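The inner consensus loop criticized above is simply repeated multiplication by the mixing matrix: iterating x ← Wx drives the nodes' values to the network average at a geometric rate governed by the mixing parameter sigma, so the number of communication rounds per outer step grows as the graph becomes poorly connected. A toy sketch on a 5-node ring with plain gossip (without the Chebyshev acceleration some of the cited methods use); the graph and weights are illustrative.

```python
import numpy as np

# Ring graph on 5 nodes with uniform gossip weights:
# each node keeps half its value and takes a quarter from each neighbor.
n = 5
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = 0.25
    W[i, (i + 1) % n] = 0.25

x = np.arange(n, dtype=float)   # initial local values 0, 1, 2, 3, 4
avg = x.mean()                  # the consensus target (= 2.0 here)

for _ in range(100):            # inner loop: 100 consensus rounds
    x = W @ x                   # one communication round each

err = np.abs(x - avg).max()     # deviation from exact consensus
```

Each round shrinks the deviation by roughly a factor sigma (about 0.65 for this ring), so after 100 rounds the error is negligible; on larger, sparser graphs sigma approaches 1 and many more rounds are needed, which is precisely the communication bottleneck the snippet refers to.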