2021
DOI: 10.1109/tac.2020.2972824

Push–Pull Gradient Methods for Distributed Optimization in Networks

Abstract: In this paper, we focus on solving a distributed convex optimization problem in a network, where each agent has its own convex cost function and the goal is to minimize the sum of the agents' cost functions while obeying the network connectivity structure. In order to minimize the sum of the cost functions, we consider new distributed gradient-based methods where each node maintains two estimates, namely, an estimate of the optimal decision variable and an estimate of the gradient for the average of the agents…
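To make the two-estimate scheme in the abstract concrete, the sketch below implements a push-pull / gradient-tracking style update on a toy quadratic problem: decision estimates are mixed ("pulled") through a row-stochastic matrix R, while gradient-tracking estimates are mixed ("pushed") through a column-stochastic matrix C. The topology, cost functions, step size, and iteration count are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

# Illustrative sketch of a push-pull / gradient-tracking style update.
# Assumptions: directed ring with self-loops, quadratic local costs
# f_i(x) = 0.5 * (x - b_i)^2, and hand-picked step size / iteration count.
n, alpha, iters = 5, 0.1, 200
rng = np.random.default_rng(0)
b = rng.normal(size=n)                      # each agent's private data
grad = lambda x: x - b                      # stacked per-agent gradients

A = np.eye(n) + np.roll(np.eye(n), 1, axis=1)   # adjacency: self + successor
R = A / A.sum(axis=1, keepdims=True)        # row-stochastic "pull" matrix
C = A / A.sum(axis=0, keepdims=True)        # column-stochastic "push" matrix

x = np.zeros(n)                             # estimates of the optimal decision variable
y = grad(x)                                 # estimates of the average gradient
for _ in range(iters):
    x_new = R @ (x - alpha * y)             # pull decision variables from neighbors
    y = C @ y + grad(x_new) - grad(x)       # push gradient-tracking information
    x = x_new

print(x, "vs optimum", b.mean())            # all agents approach the minimizer of the sum
```

With a strongly connected graph and a sufficiently small step size, every agent's estimate approaches the minimizer of the summed cost, here the mean of b.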

Cited by 245 publications (184 citation statements)
References 47 publications
“…For another example, in simulation-based optimization, the gradient estimation often incurs noise that can be due to various sources, such as modeling and discretization errors, incomplete convergence, and finite sample size for Monte-Carlo methods [22]. Distributed algorithms dealing with problem (1) have been studied extensively in the literature [56,36,37,28,19,20,52,13,46,34,45]. Recently, there has been considerable interest in distributed implementation of stochastic gradient algorithms [48,54,14,3,5,55,6,9,10,7,32,24,26,40,51,41,18].…”
Section: Scenarios In Which Problem…
Citation type: mentioning
Confidence: 99%
“…It is shown in [19] that the oracle complexity with doubly-stochastic weights is O(Q^2 log(1/ε)). Extensions of AB include: non-coordinated step-sizes and heavy-ball momentum [32]; time-varying graphs [36], [37]; and analysis for non-convex functions [38]. Related work on distributed Nesterov-type methods can be found in [39]–[41], which is restricted to undirected graphs.…”
Section: A Centralized Optimization: Nesterov's Methods
Citation type: mentioning
Confidence: 99%
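The heavy-ball extension mentioned in the excerpt above amounts to adding a momentum term β(x_k − x_{k−1}) to the mixing-plus-descent step of an AB-type update. The helper below is only a hypothetical sketch of that idea; the matrix names A and B (row- and column-stochastic mixing matrices), the gradient-tracking form, and the parameter values are assumptions rather than the exact scheme of [32].

```python
import numpy as np

def ab_momentum_step(x, x_prev, y, A, B, grad, alpha=0.05, beta=0.3):
    """One iteration of an AB-style update with a heavy-ball momentum term.

    Hypothetical sketch: A is a row-stochastic mixing matrix, B is a
    column-stochastic mixing matrix, grad(v) stacks each agent's local
    gradient evaluated at its own estimate, and alpha/beta are illustrative.
    """
    x_new = A @ x - alpha * y + beta * (x - x_prev)  # mix, descend, add momentum
    y_new = B @ y + grad(x_new) - grad(x)            # track the average gradient
    return x_new, x, y_new
```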
“…Since these variants only require CS weights, AB and ABN are preferable due to their faster convergence. It is further straightforward to conceive a time-varying implementation of ABN and FROZEN over gossip-based protocols or random graphs; see, e.g., the related work in [36], [37] on non-accelerated methods. Asynchronous schemes may also be derived following the methodologies studied in [42], [43].…”
Section: Algorithm 2 Frozen
Citation type: mentioning
Confidence: 99%
“…In [8] and [12], distributed gradient methods for unconstrained optimization problems are considered. [8] uses the push-sum consensus protocol to compute distributed information and proves that convergence to an optimum is possible even when the communication links change over time. [12] studies a push-pull consensus protocol for solving a distributed optimization problem and establishes linear convergence of the algorithm presented there. [5] likewise studies unconstrained gradient methods that are based on a distributed execution of Nesterov's gradient method and therefore exhibit a fast convergence rate.…”
Section: Introduction
Citation type: unclassified
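For reference, the push-sum protocol cited for [8] can be summarized in a few lines: each node repeatedly splits a running sum and a running weight over its out-neighbors, and the ratio of the two converges to the network-wide average even over directed links. The sketch below uses a fixed directed ring with self-loops purely for illustration; handling time-varying links, as discussed for [8], is not shown.

```python
import numpy as np

# Minimal push-sum average-consensus sketch over a fixed directed graph.
# Illustrative assumptions: ring topology with self-loops, static links,
# and a hand-picked iteration count.
n, iters = 5, 100
rng = np.random.default_rng(1)
values = rng.normal(size=n)              # private values whose average is sought

out_neighbors = [[i, (i + 1) % n] for i in range(n)]   # each node sends to itself and its successor
s, w = values.copy(), np.ones(n)         # push-sum numerators and weights

for _ in range(iters):
    s_new, w_new = np.zeros(n), np.zeros(n)
    for i in range(n):
        share_s = s[i] / len(out_neighbors[i])          # split mass over out-neighbors
        share_w = w[i] / len(out_neighbors[i])
        for j in out_neighbors[i]:
            s_new[j] += share_s
            w_new[j] += share_w
    s, w = s_new, w_new

print(s / w, "vs true average", values.mean())          # ratios converge to the average
```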