2017 · Preprint
DOI: 10.48550/arxiv.1705.09056

Can Decentralized Algorithms Outperform Centralized Algorithms? A Case Study for Decentralized Parallel Stochastic Gradient Descent

Abstract: Most distributed machine learning systems nowadays, including TensorFlow and CNTK, are built in a centralized fashion. One bottleneck of centralized algorithms lies in the high communication cost on the central node. Motivated by this, we ask: can decentralized algorithms be faster than their centralized counterparts? Although decentralized PSGD (D-PSGD) algorithms have been studied by the control community, existing analysis and theory do not show any advantage over centralized PSGD (C-PSGD) algorithms, simply assumi…
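
The contrast in the abstract is between C-PSGD, where every worker synchronizes through a central parameter server, and D-PSGD, where each worker averages its model only with its graph neighbors before taking a local stochastic gradient step. A minimal sketch of D-PSGD on a ring topology, assuming synthetic quadratic local losses (all names here are illustrative, not from the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 8, 4                        # workers, model dimension
x = rng.normal(size=(n, d))        # one local model copy per worker
targets = rng.normal(size=(n, d))  # center of each worker's local quadratic loss

# Doubly stochastic mixing matrix for a ring: average with left/right neighbors.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = W[i, (i - 1) % n] = W[i, (i + 1) % n] = 1.0 / 3.0

def local_grad(i, xi):
    # Stochastic gradient of f_i(x) = 0.5 * ||x - target_i||^2, plus noise.
    return (xi - targets[i]) + 0.01 * rng.normal(size=d)

step = 0.1
for t in range(200):
    grads = np.stack([local_grad(i, x[i]) for i in range(n)])
    # D-PSGD iteration: gossip-average with neighbors, then a local SGD step.
    x = W @ x - step * grads

print("disagreement across workers:", np.linalg.norm(x - x.mean(axis=0)))
```

The only communication per iteration is the `W @ x` mixing, i.e., one model exchange with each of the two ring neighbors, rather than a round-trip through a central node.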

Cited by 25 publications (38 citation statements) · References 40 publications

“…Decentralized learning has many advantages over centralized learning. In [3], Lian et al. rigorously proved that a decentralized algorithm has lower communication complexity and possesses the same convergence rate as that under a centralized parameter-server model. Furthermore, a centralized topology might not hold in a decentralized network where no one can be trusted enough to act as a parameter server.…”
Section: Decentralized Learning (mentioning)
confidence: 99%
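
To make the quoted communication claim concrete: with n workers, the parameter server in C-PSGD sends and receives n model-sized messages per iteration, while a D-PSGD node talks only to its neighbors. A back-of-envelope summary in our own notation (d is the model dimension, T the iteration count; the stated rate is the paper's nonconvex result as we recall it):

```latex
% Per-iteration traffic at the busiest node (model dimension d, n workers):
%   C-PSGD: the server receives n gradients and sends n models  => O(n d)
%   D-PSGD: node i exchanges models with its deg(i) neighbors   => O(deg(i) d)
% Lian et al. nevertheless prove a same-order convergence rate for smooth
% nonconvex objectives, so removing the O(n) hotspot costs no asymptotic speed.
\[
  \underbrace{O(n\,d)}_{\text{C-PSGD server}}
  \quad\text{vs.}\quad
  \underbrace{O(\deg(i)\,d)}_{\text{D-PSGD node } i},
  \qquad
  \text{both at rate } O\!\bigl(1/\sqrt{nT}\bigr)\ \text{after } T \text{ iterations.}
\]
```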
“…Such a decentralized network is frequently adopted in ad hoc networks, edge computing, the Internet of Things (IoT), decentralized applications (DApps), etc. It can greatly unleash the potential for building large-scale (even worldwide) machine learning models that can reasonably and fully maximize the utilization of computational resources [3]. Besides, devices such as mobile…”
Section: Introduction (mentioning)
confidence: 99%
“…Different aggregation techniques that combine the local model updates from all the clients participating in the training cycle have been presented in this area [23]. Aggregation algorithms may be divided into three types: centralized, hierarchical, and decentralized [24].…”
Section: Aggregation Algorithms (mentioning)
confidence: 99%
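
The centralized and decentralized patterns in this taxonomy can both be viewed as a mixing matrix applied to the clients' local updates, with hierarchical aggregation composing the two (intra-group averaging, then aggregation among group leaders). A small sketch under that view (the matrix names and ring topology are our illustrative choices, not from the cited survey):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
updates = rng.normal(size=(n, 3))   # one local model update per client

# Centralized: the server's average, which every client then downloads.
# As a mixing matrix this is the rank-one all-(1/n) matrix.
W_central = np.full((n, n), 1.0 / n)

# Decentralized (ring): each client averages only with its two neighbors,
# so one round gives a partial average and repeated rounds converge to it.
W_ring = np.zeros((n, n))
for i in range(n):
    W_ring[i, [(i - 1) % n, i, (i + 1) % n]] = 1.0 / 3.0

global_avg = W_central @ updates     # exact average, one costly round
gossip = updates.copy()
for _ in range(20):                  # several cheap neighbor-only rounds
    gossip = W_ring @ gossip
print(np.abs(gossip - global_avg).max())  # near 0: gossip reaches the average
```

Repeated cheap gossip rounds approach the exact centralized average, which is the trade the decentralized pattern makes: more rounds, but no single node ever handles all n messages.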
“…Nothing is known about the slowly-decaying non-square-summable ones, which are generally preferable since they give similar benefits as constant stepsizes and, often, also guarantee convergence. Note that such issues also plague much of the distributed stochastic optimization literature: Yuan et al. [2016], Sun et al. [2019], Lian et al. [2017], Koloskova et al. [2020], Pu and Nedić [2020].…”
Section: Related Work (mentioning)
confidence: 99%
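
For context on the quoted distinction: a stepsize sequence is square-summable when the sum of its squares is finite. A standard pair of examples (textbook definitions, not drawn from the cited works):

```latex
% Square-summable (the decaying stepsizes existing analyses cover):
%   gamma_t = c/(t+1):      sum_t gamma_t = inf,  sum_t gamma_t^2 < inf.
% Slowly decaying, non-square-summable (the open case in the excerpt):
%   gamma_t = c/sqrt(t+1):  sum_t gamma_t = inf,  sum_t gamma_t^2 = inf,
% close enough to a constant stepsize over long horizons to share its
% practical benefits, while still vanishing in the limit.
\[
  \gamma_t = \frac{c}{t+1}:\ \sum_{t} \gamma_t^2 < \infty,
  \qquad
  \gamma_t = \frac{c}{\sqrt{t+1}}:\ \sum_{t} \gamma_t^2 = \infty .
\]
```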