Given an undirected graph $\mathcal{G}=(\mathcal{N},\mathcal{E})$ of agents $\mathcal{N}=\{1,\ldots,N\}$ connected with edges in $\mathcal{E}$, we study how to compute an optimal decision on which there is consensus among the agents and that minimizes the sum of agent-specific private convex composite functions $\{\Phi_i\}_{i\in\mathcal{N}}$ while respecting privacy requirements, where $\Phi_i\triangleq\xi_i+f_i$ belongs to agent-$i$. Assuming only agents connected by an edge can communicate, we propose a distributed proximal gradient method, DPGA, for consensus optimization over both unweighted and weighted static (undirected) communication networks. In each iteration, every agent-$i$ computes the prox map of $\xi_i$ and the gradient of $f_i$, followed by local communication with neighboring agents. We also study its stochastic gradient variant, SDPGA, in which each agent-$i$ can only access noisy estimates of $\nabla f_i$. This computational model abstracts a number of applications in distributed sensing, machine learning and statistical inference. We show ergodic convergence in both suboptimality error and consensus violation for DPGA and SDPGA with rates $\mathcal{O}(1/t)$ and $\mathcal{O}(1/\sqrt{t})$, respectively.

This computational setting, i.e., decentralized consensus optimization, appears as a generic model for various applications in signal processing, e.g., [2]-[6], machine learning, e.g., [7]-[9], and statistical inference, e.g., [10], [11]. Clearly, (3) can also be solved in a "centralized" fashion by communicating all the private functions $\Phi_i$ to a central node and solving the overall problem at this node. However, such an approach can be very expensive in terms of both communication and computation.

A number of existing distributed algorithms compute a solution $\bar{\mathbf{x}}=[\bar{x}_i]_{i\in\mathcal{N}}$ such that its consensus violation satisfies $\max\{\|\bar{x}_i-\bar{x}_j\|_2:\,(i,j)\in\mathcal{E}\}\leq\epsilon$ within $\mathcal{O}(1/\epsilon)$ iterations, and its suboptimality is bounded from above as $\sum_{i\in\mathcal{N}}\Phi_i(\bar{x}_i)-F^*\leq\epsilon$ within $\mathcal{O}(1/\epsilon^2)$ iterations; however, since the step size is constant, neither the suboptimality nor the consensus error is guaranteed to decrease further. Although these algorithms are for more general problems and assume mere convexity of each $\Phi_i$, this generality comes at the cost of $\mathcal{O}(1/\epsilon^2)$ complexity bounds, and they also tend to be very slow in practice. At the other extreme, under much stronger conditions, namely that each $\Phi_i$ is smooth and has bounded gradients, Jakovetic et al. [19] developed a fast distributed gradient method, D-NC, with an $\mathcal{O}(\log(1/\epsilon)/\sqrt{\epsilon})$ convergence rate in terms of communication rounds. For the quadratic loss, which is one of the most commonly used loss functions, the bounded gradient assumption does not hold. In terms of distributed applicability, D-NC requires all the nodes in $\mathcal{N}$ to agree on a doubly stochastic weight matrix $W\in\mathbb{R}^{|\mathcal{N}|\times|\mathcal{N}|}$; it also assumes that the second largest eigenvalue of $W$ is known globally by all the nodes, which is not attainable for very large-scale, fully distributed networks. D-NC is a two-loop algorithm: in each outer iteration $k$, every node computes its gradient once, followed by $\mathcal{O}(\log(k))$ communication rounds. In the rest, we briefly discuss those algorithms that balance the trade-off between the iterati...
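To make the per-iteration pattern described in the abstract concrete (a local prox-gradient step on $\Phi_i=\xi_i+f_i$ followed by communication with neighboring agents), the following Python sketch shows one round of a generic decentralized proximal gradient scheme. It is not the exact DPGA/SDPGA recursion analyzed in this paper: the soft-thresholding prox standing in for $\xi_i$, the uniform neighbor averaging, the step size, and the Gaussian noise used to mimic noisy gradient estimates are all illustrative assumptions.

```python
# Minimal sketch of a decentralized prox-gradient round (illustrative, not the paper's DPGA/SDPGA).
import numpy as np

def prox_l1(v, t):
    """Soft-thresholding: prox map of t*||.||_1, standing in for the prox of xi_i (an assumption)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def decentralized_prox_grad_round(x, grad_f, neighbors, step, noise_std=0.0, rng=None):
    """One round: (i) local (possibly noisy) gradient + prox step, (ii) averaging with neighbors.

    x:         dict agent -> local iterate (np.ndarray)
    grad_f:    dict agent -> callable returning the gradient of f_i at a point
    neighbors: dict agent -> list of neighbors in the undirected graph
    noise_std: > 0 mimics access to noisy estimates of grad f_i (the SDPGA-like setting)
    """
    rng = rng if rng is not None else np.random.default_rng()
    half = {}
    for i, xi in x.items():
        g = grad_f[i](xi)
        if noise_std > 0.0:
            g = g + noise_std * rng.standard_normal(g.shape)  # noisy gradient estimate
        half[i] = prox_l1(xi - step * g, step)                # prox-gradient step on Phi_i
    # local communication: average with neighbors (uniform weights, an assumption)
    return {i: np.mean([half[j] for j in [i] + list(neighbors[i])], axis=0) for i in x}

# Toy usage on a 3-node path graph, minimizing sum_i (0.5*||x - a_i||^2 + ||x||_1).
a = {0: np.array([1.0]), 1: np.array([2.0]), 2: np.array([3.0])}
x = {i: np.zeros(1) for i in a}
grad_f = {i: (lambda z, ai=a[i]: z - ai) for i in a}
nbrs = {0: [1], 1: [0, 2], 2: [1]}
for _ in range(200):
    x = decentralized_prox_grad_round(x, grad_f, nbrs, step=0.1)
```

Setting noise_std > 0 in this sketch only mimics the stochastic-gradient setting; no claim is made that this simplified averaging scheme attains the $\mathcal{O}(1/t)$ and $\mathcal{O}(1/\sqrt{t})$ ergodic rates established for DPGA and SDPGA.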