Preprint (2022)
DOI: 10.48550/arxiv.2203.09079

On the convergence of decentralized gradient descent with diminishing stepsize, revisited

Abstract: Distributed optimization has received a lot of interest in recent years due to its wide applications in various fields. In this work, we revisit the convergence property of the decentralized gradient descent [A. Nedić and A. Ozdaglar (2009)] on the whole space, where the stepsize is given as α(t) = a/(t+w)^p with 0 < p ≤ 1. Under the strong convexity assumption on the total cost function f, with the local cost functions f_i not necessarily being convex, we show that the sequence converges to the optimizer with r…
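
The DGD iteration from Nedić and Ozdaglar (2009) referenced above takes the standard form x_i(t+1) = Σ_j w_ij x_j(t) − α(t) ∇f_i(x_i(t)). Below is a minimal sketch of that iteration with the diminishing stepsize α(t) = a/(t+w)^p from the abstract; the quadratic local costs, the mixing matrix W, and the parameter values are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of decentralized gradient descent (DGD) with the
# diminishing stepsize alpha(t) = a / (t + w)**p described in the abstract.
# The quadratic local costs, the ring mixing matrix W, and the parameter
# values are illustrative assumptions, not taken from the paper.
import numpy as np

n_agents, dim = 4, 2
rng = np.random.default_rng(0)

# Local costs f_i(x) = 0.5 * ||x - b_i||^2, so the total cost f is strongly convex.
b = rng.normal(size=(n_agents, dim))

def grad_f(i, x):
    """Gradient of the i-th local cost at x."""
    return x - b[i]

# Doubly stochastic mixing matrix for a ring of 4 agents (assumed topology).
W = np.array([[0.50, 0.25, 0.00, 0.25],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.25, 0.00, 0.25, 0.50]])

a, w, p = 1.0, 1.0, 0.6          # stepsize parameters, with 0 < p <= 1
x = np.zeros((n_agents, dim))    # one row per agent

for t in range(2000):
    alpha = a / (t + w) ** p                              # diminishing stepsize
    grads = np.stack([grad_f(i, x[i]) for i in range(n_agents)])
    x = W @ x - alpha * grads                             # consensus + local gradient step

print("agent iterates:\n", x)
print("minimizer of the total cost:", b.mean(axis=0))
```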

Cited by 2 publications (5 citation statements). References 15 publications.

Citation statements:
“…The convergence estimates of (1.3) have been established well in the previous works [20,21,27,9]. For a constant stepsize, Nedić-Ozdaglar [20] showed that the sequence converges to an O(α)-neighborhood of the optimal set.…”
Section: Introduction (supporting)
confidence: 58%
“…It was shown that the sequence with constant stepsize α(t) ≡ α > 0 (smaller than a specific value) converges to an O(α)-neighborhood of the optimal point exponentially fast if the global cost function is strongly convex. This result was extended [9] to the case with decreasing stepsize of the form α(t) = c/(t + w)^α for 0 < α ≤ 1. We mention that, if Ω ≠ R^n, the convergence analysis for (1.2) becomes more challenging due to the projection operator.…”
Section: Introduction (mentioning)
confidence: 92%
“…Over the last decades, various distributed algorithms have been proposed in the literature, such as the alternating direction method of multipliers 21,35 , distributed dual averaging 14,37 , distributed gradient descent (DGD) 8,23,24,43 , and the distributed Newton method 22 . We refer to 27,39 for a survey on distributed optimization.…”
Section: Introduction (mentioning)
confidence: 99%
“…Meanwhile, the convergence property of QDGD has not been fully explored, given the convergence results for DGD, which corresponds to QDGD with σ = 0. Specifically, it was proved in the works 43,8 that the DGD algorithm converges exponentially fast to a neighborhood of the optimizer if the stepsize is a constant smaller than a specific value depending on the cost functions and the aggregate cost function is strongly convex. Here, the radius of the neighborhood is comparable to the size of the stepsize.…”
Section: Introduction (mentioning)
confidence: 99%
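
The O(α)-neighborhood behavior quoted above can be seen on a toy problem. The sketch below (an illustrative setup, not from the paper or the citing works) runs DGD on a 1-D strongly convex problem: with a constant stepsize the final error scales roughly with α, while a diminishing stepsize of the form a/(t+w)^p keeps driving the error down.

```python
# Illustrative comparison, not taken from the paper: on a toy 1-D problem
# with a strongly convex total cost, DGD with a constant stepsize stalls at
# a neighborhood of the optimizer whose radius shrinks with the stepsize,
# while a diminishing stepsize drives the error toward zero.
import numpy as np

b = np.array([-1.0, 0.0, 2.0])        # local costs f_i(x) = 0.5 * (x - b_i)^2
x_star = b.mean()                     # minimizer of the total cost
W = np.full((3, 3), 1.0 / 3.0)        # complete-graph averaging matrix (assumption)

def run_dgd(stepsize, T=5000):
    """Run DGD for T steps and return the worst agent's distance to the optimizer."""
    x = np.zeros(3)
    for t in range(T):
        x = W @ x - stepsize(t) * (x - b)   # DGD step: mixing + local gradient
    return np.abs(x - x_star).max()

for alpha in (0.1, 0.01):
    print(f"constant stepsize {alpha}:    final error {run_dgd(lambda t: alpha):.2e}")
print(f"diminishing 1/(t+1)^0.8: final error {run_dgd(lambda t: 1.0 / (t + 1) ** 0.8):.2e}")
```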