2015
DOI: 10.48550/arxiv.1507.06970
Preprint

Perturbed Iterate Analysis for Asynchronous Stochastic Optimization

Cited by 36 publications (56 citation statements)
“…Proof. This is the first step of the perturbed iterate analysis framework Mania et al [2015]. We follow the steps as in Stich et al [2018].…”
Section: A1 Proof of the Main Theorem (mentioning)
confidence: 99%
“…All the transmitters/receivers are equipped with 4 antennas; we simulated uncorrelated fading channels, whose coefficients are Gaussian distributed with zero mean and variance $1/d_{ij}^3$ (all the channel matrices are full-column rank); and we set $R_{n_i} = \sigma^2 I$ for all $i$, and snr $p/\sigma^2 = 3$ dB. In MIMO-SR-FLEXA, we used the step-size rule (108), with $\varepsilon = 10^{-5}$; in (153) we set $\tau_i = 0$ and computed $Q_i(Q^k)$ using the closed form solution in [123]. All the algorithms reach the same average sum-rate.…”
Section: Sum-rate Maximization Over MIMO Interference Channels (mentioning)
confidence: 99%
“…Although asynchronous block-methods have a long history (see, e.g., [5,16,45,87,237]), in the past few years, the study of asynchronous parallel optimization methods has witnessed a revival of interest. Indeed, asynchronous parallelism has been applied to many state-of-the-art optimization algorithms (mainly for convex objective functions and constraints), including stochastic gradient methods [109,137,144,153,167,184,195] and ADMM-like schemes [105,110,247]. The asynchronous counterpart of BCD methods has been introduced and studied in the seminal work [146], which motivated and oriented much of subsequent research in the field, see e.g.…”
Section: II.7 Sources and Notes (mentioning)
confidence: 99%
“…A large number of recent studies revisited the idea of low-precision training as a means to reduce communication (Seide et al, 2014; De Sa et al, 2015; Alistarh et al, 2017; Zhou et al, 2016; Wen et al, 2017; Zhang et al, 2017; De Sa et al, 2017; Bernstein et al, 2018a). Other approaches for low-communication training focus on sparsification of gradients, either by thresholding small entries or by random sampling (Strom, 2015; Mania et al, 2015; Suresh et al, 2016; Leblond et al, 2016; Aji & Heafield, 2017; Lin et al, 2017; Chen et al, 2017; Renggli et al, 2018; Tsuzuku et al, 2018; Wang et al, 2018; Vogels et al, 2019).…”
Section: Introduction (mentioning)
confidence: 99%