2021
DOI: 10.1109/jsait.2021.3052975
Private Weighted Random Walk Stochastic Gradient Descent

Abstract: We consider a decentralized learning setting in which data is distributed over the nodes of a graph. The goal is to learn a global model on the distributed data without involving any central entity that needs to be trusted. While gossip-based stochastic gradient descent (SGD) can be used to achieve this learning objective, it incurs high communication and computation costs, since it has to wait for all the local models at all the nodes to converge. To speed up the convergence, we propose instead to study random walk…
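As a concrete illustration of the random-walk idea in the abstract, here is a minimal Python sketch in which a single model is carried along a uniform random walk over a ring graph and updated with one local SGD step at each visited node. The toy least-squares data, the `local_gradient` helper, and the 1/√k step size are assumptions for illustration only, not the paper's exact algorithm.

```python
# Minimal sketch of random-walk SGD on a graph: the model travels along a
# random walk and only the currently visited node computes a gradient step.
# All data, names and step sizes below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy ring graph: node i is connected to its two neighbours.
n_nodes, dim = 8, 3
neighbors = {i: [(i - 1) % n_nodes, (i + 1) % n_nodes] for i in range(n_nodes)}

# Each node stores a small local dataset (X_i, y_i) for a linear model.
w_true = rng.normal(size=dim)
local_data = {}
for i in range(n_nodes):
    X = rng.normal(size=(20, dim))
    y = X @ w_true + 0.1 * rng.normal(size=20)
    local_data[i] = (X, y)

def local_gradient(w, node):
    """Stochastic gradient of the squared loss on one sample of the node's data."""
    X, y = local_data[node]
    j = rng.integers(len(y))
    return (X[j] @ w - y[j]) * X[j]

# Unlike gossip, only one node is active per iteration: it updates the model
# locally and forwards it to a randomly chosen neighbour.
w = np.zeros(dim)
node = 0
for k in range(1, 2001):
    w -= (1.0 / np.sqrt(k)) * local_gradient(w, node)   # local SGD step
    node = rng.choice(neighbors[node])                   # move to a random neighbour

print("distance to ground truth:", np.linalg.norm(w - w_true))
```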

Cited by 13 publications (9 citation statements)
References 29 publications
“…In [13], the authors studied the convergence of random-walk learning for the alternating direction method of multipliers (ADMM). In [21], a weighted random walk is designed that accounts for the importance of the local data, improving the convergence guarantees and speeding up convergence. An asymptotic fundamental bound on the convergence rate of these algorithms was proven in [22]; it approaches O(1/√k) under convexity and bounded-gradient assumptions.…”
Section: Prior Work
Citation type: mentioning
confidence: 99%
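The weighted random walk of [21] biases the walk toward nodes whose local data matter more. One standard way to realize a walk with a prescribed stationary distribution is a Metropolis-Hastings correction of the uniform neighbour walk; the sketch below uses assumed per-node importance weights and is only an illustration of the idea, not necessarily the weighting scheme of [21].

```python
# Illustrative sketch: build a random walk whose stationary distribution is
# proportional to per-node importance weights, via a Metropolis-Hastings
# correction of the uniform neighbour proposal. Weights are assumptions.
import numpy as np

neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
importance = np.array([4.0, 1.0, 2.0, 1.0])      # assumed per-node weights
pi = importance / importance.sum()                # desired stationary distribution

def transition_matrix(neighbors, pi):
    n = len(pi)
    P = np.zeros((n, n))
    for i in range(n):
        d_i = len(neighbors[i])
        for j in neighbors[i]:
            d_j = len(neighbors[j])
            # Metropolis-Hastings acceptance keeps the chain reversible w.r.t. pi.
            P[i, j] = (1.0 / d_i) * min(1.0, (pi[j] * d_i) / (pi[i] * d_j))
        P[i, i] = 1.0 - P[i].sum()                # lazy self-loop absorbs the rest
    return P

P = transition_matrix(neighbors, pi)

# Empirical check: the left Perron eigenvector of P should match pi.
evals, evecs = np.linalg.eig(P.T)
stat = np.real(evecs[:, np.argmax(np.real(evals))])
print(stat / stat.sum(), pi)
```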
“…The goal of a gossip algorithm is to ensure that all nodes, and not just the parameter server (PS), learn the global model; convergence is assumed once a consensus is reached. Hence, gossip is less efficient in terms of computation and communication costs [21].…”
Section: Prior Work
Citation type: mentioning
confidence: 99%
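For contrast with the single-model random walk above, here is a minimal sketch of synchronous gossip averaging: every node keeps its own model and repeatedly averages with its neighbours until consensus, which is why each round involves computation and communication at all nodes. The ring topology and uniform mixing weights are illustrative assumptions, not the specific protocol discussed in [21].

```python
# Contrasting sketch: synchronous gossip averaging over a ring graph.
# Every node participates in every round, unlike random-walk SGD.
import numpy as np

rng = np.random.default_rng(1)
n_nodes, dim = 8, 3
neighbors = {i: [(i - 1) % n_nodes, (i + 1) % n_nodes] for i in range(n_nodes)}

# Every node starts from its own local estimate of the model.
models = rng.normal(size=(n_nodes, dim))

for _ in range(200):
    new_models = models.copy()
    for i in range(n_nodes):
        # Uniform averaging over the node itself and its neighbours
        # (doubly stochastic on this regular ring, so it converges to the mean).
        nbrs = neighbors[i] + [i]
        new_models[i] = models[nbrs].mean(axis=0)
    models = new_models

# All rows converge to (approximately) the same consensus vector.
print("max per-coordinate spread:", np.max(np.std(models, axis=0)))
```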
“…Markov chains naturally appear in many important problems, such as decentralized consensus optimization, which finds applications in areas including wireless sensor networks, smart grid implementations, and distributed statistical learning [4,14,23,48,50,58,61,63], as well as pairwise learning [78], which instantiates AUC maximization [1,29,46,81,87] and metric learning [35,75,76,79]. A common example is a distributed system in which each node stores a subset of the whole data, and one aims to train a global model based on these data.…”
Section: Introduction
Citation type: mentioning
confidence: 99%