2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton)
DOI: 10.1109/allerton.2012.6483272

Distributed strongly convex optimization

Abstract: A lot of effort has been invested into characterizing the convergence rates of gradient-based algorithms for non-linear convex optimization. Recently, motivated by large datasets and problems in machine learning, interest has shifted towards distributed optimization. In this work we present a distributed algorithm for strongly convex constrained optimization. Each node in a network of n computers converges to the optimum of a strongly convex, L-Lipschitz continuous, separable objective at a rate O(log(√n…
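As a rough illustration of the kind of method the abstract describes (this is not the authors' algorithm; the quadratic objective, unit-ball constraint, complete-graph mixing matrix, and all names below are made-up assumptions), a distributed projected gradient scheme with a 1/(μk) step size and consensus averaging can be sketched in Python as:

    import numpy as np

    # Sketch: n nodes minimize the separable, strongly convex objective
    # sum_i f_i(x), with f_i(x) = (mu/2)*||x - a_i||^2, over the unit ball.
    # Each round: average with neighbors, then take a projected gradient step.
    rng = np.random.default_rng(0)
    n, d, T = 8, 5, 2000
    mu = 1.0
    A = rng.normal(size=(n, d))        # local data a_i held by node i (assumed)
    W = np.full((n, n), 1.0 / n)       # doubly stochastic mixing matrix (complete graph)

    def project_unit_ball(x):
        nrm = np.linalg.norm(x)
        return x if nrm <= 1.0 else x / nrm

    X = np.zeros((n, d))               # one local iterate per node
    for k in range(1, T + 1):
        grads = mu * (X - A)           # gradient of f_i at the current local iterate
        X = W @ X                      # consensus (averaging) step
        step = 1.0 / (mu * k)          # diminishing step size suited to strong convexity
        X = np.array([project_unit_ball(x - step * g) for x, g in zip(X, grads)])

    x_star = project_unit_ball(A.mean(axis=0))   # constrained minimizer of the sum
    print("max node error:", np.abs(X - x_star).max())

A real network would use a sparse, topology-dependent W rather than uniform averaging; the 1/(μk) schedule is the usual choice for strongly convex objectives.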

Cited by 57 publications (60 citation statements). References 13 publications.
“…the strong convexity (Hazan & Kale, 2011) and strong smoothness are dual properties, strongly convex programming algorithms have many benign properties both on the speed of optimization and the quality of generalization; see, for example, (Hazan & Kale, 2011; Rakhlin et al., 2012; Tsianos & Rabbat, 2012; Kakade & Tewari, 2009). …”
Section: Theorem
Mentioning confidence: 99%
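For reference, the strong convexity / strong smoothness duality this excerpt invokes is a standard convex-analysis fact (not specific to the cited papers): for a closed proper convex f with conjugate f^*(u) = \sup_x \{\langle u, x\rangle - f(x)\},

    f(y) \ge f(x) + \langle g,\, y - x\rangle + \tfrac{\mu}{2}\|y - x\|^2 \quad \forall\, g \in \partial f(x)
    \;\Longleftrightarrow\;
    \|\nabla f^*(u) - \nabla f^*(v)\| \le \tfrac{1}{\mu}\|u - v\| \quad \forall\, u, v,

i.e. f is μ-strongly convex exactly when f^* is (1/μ)-smooth; this exchange between strong convexity and smoothness is what the cited analyses exploit.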
“…Distributed subgradient descent fits asynchronous networks, but suffers from slow convergence. The descent rate of the objective value is typically O(log(k)/k), where k is the number of iterations [12]. The ADMM generally needs synchronous steps taken by all the agents, but has much faster empirical convergence.…”
Section: Related Work
Mentioning confidence: 99%
“…Proof: Subtracting the three equations in (8) from the corresponding equations in (6) yields

    \nabla f(x^{k+1}) - \nabla f(x^*) = c M_+ (z^k - z^{k+1}) - M_- (\beta^{k+1} - \beta^*),    (12)
    \tfrac{c}{2} M_-^T (x^{k+1} - x^*) = \beta^{k+1} - \beta^k,    (13)
    \tfrac{1}{2} M_+^T (x^{k+1} - x^*) = z^{k+1} - z^*,    (14)

respectively.…”
Section: (11)
Mentioning confidence: 99%
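The excerpt is working with optimality conditions of a decentralized consensus ADMM (the matrices M_+ and M_-, the multiplier β, and the auxiliary variable z belong to that setup). Purely as context, and assuming the simpler global-averaging form of consensus ADMM rather than the exact edge-based scheme whose equations (6), (8), and (12)-(14) are quoted, a minimal sketch with a quadratic objective (data and penalty parameter c made up) is:

    import numpy as np

    # Consensus ADMM sketch for min_x sum_i f_i(x), with f_i(x) = 0.5*||x - a_i||^2.
    # Node i keeps a local copy x_i and a scaled dual u_i; z is the consensus variable.
    rng = np.random.default_rng(1)
    n, d, c = 6, 4, 1.0
    A = rng.normal(size=(n, d))

    X = np.zeros((n, d))
    U = np.zeros((n, d))
    z = np.zeros(d)

    for k in range(200):
        # x-update: argmin_x f_i(x) + (c/2)||x - z + u_i||^2 (closed form for quadratic f_i)
        X = (A + c * (z - U)) / (1.0 + c)
        # z-update: average of x_i + u_i (enforces consensus)
        z = (X + U).mean(axis=0)
        # dual update: accumulate the consensus residual
        U = U + X - z

    print("max deviation from the optimum:", np.abs(X - A.mean(axis=0)).max())

Relations like the quoted (12)-(14) typically come from subtracting the fixed-point (KKT) system of such an iteration from the per-iteration optimality conditions, which appears to be the step the excerpt is carrying out.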