2020
DOI: 10.1080/10556788.2020.1746963

Asynchronous variance-reduced block schemes for composite non-convex stochastic optimization: block-specific steplengths and adapted batch-sizes

Abstract: We consider the minimization of a sum of an expectation-valued, coordinate-wise L_i-smooth nonconvex function and a nonsmooth block-separable convex regularizer. Prior schemes are characterized by the following shortcomings: (a) steplengths require global knowledge of Lipschitz constants; (b) batch-sizes of gradients are centrally updated and require knowledge of the global clock; (c) a.s. convergence guarantees are unavailable; (d) rates are inferior compared to deterministic counterparts. Specifically, (a) an…
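For orientation, the problem class and the block-proximal update described in the abstract can be sketched as follows. The notation (b blocks, block steplengths γ_i, batch-size N_k) is my own shorthand and need not match the paper's exact formulation.

```latex
% Sketch of the composite problem (notation is illustrative, not necessarily the paper's):
\min_{x \in \mathbb{R}^n} \; f(x) + \sum_{i=1}^{b} r_i(x_i),
\qquad f(x) \,=\, \mathbb{E}\!\left[\tilde f(x,\omega)\right],
% where f is nonconvex with block-wise Lipschitz gradients (constants L_i) and each r_i is
% convex, possibly nonsmooth, and proximable. A variance-reduced block-proximal step with
% block-specific steplength gamma_i <= 1/L_i and batch-size N_k would then read:
x_i^{k+1} \;=\; \operatorname{prox}_{\gamma_i r_i}\!\left( x_i^{k}
      \,-\, \frac{\gamma_i}{N_k}\sum_{j=1}^{N_k} \nabla_i \tilde f\!\left(x^{k},\omega_{j,k}\right) \right).
```

Under this reading, block-specific steplengths presumably address shortcoming (a), since each γ_i only requires the local constant L_i rather than a global Lipschitz constant, while locally adapted batch-sizes N_k address (b).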

Cited by 9 publications (9 citation statements). References 45 publications.
“…Based on Lemmas 1, 2, and 3, we can establish the geometric rate of convergence along with the iteration and oracle complexity of the iterates generated by Algorithms 1, 2, and 3. The proofs of Propositions 1, 2, and 3 are similar to that of [26, Theorem 4.2 and Corollary 4.7]. Related results for linear convergence for stochastic gradient methods can be found in [8, 15, 23, 36, 43], while a linear rate for the accelerated variants has been provided in [22].…”
Section: Rate and Oracle Complexities
confidence: 78%
“…Unfortunately, SA schemes with diminishing steps cannot recover the deterministic convergence rates seen in exact gradient methods, while constant-steplength SA schemes are only characterized by convergence guarantees to a neighborhood of the optimal solution. Variance-reduction schemes employing an increasing batch-size of sampled gradients (instead of the unavailable true gradient) appear to have been first alluded to in [1, 12, 20, 35] and analyzed in smooth and strongly convex [8, 15, 42, 43], smooth convex [17], nonsmooth (but smoothable) convex [22], nonconvex [26], and game-theoretic [27] regimes. Notably, linear rates in mean-squared error were derived for strongly convex smooth [43] and a subclass of nonsmooth objectives [22], while rates of O(1/k^2) and O(1/k) were obtained for expected sub-optimality in convex smooth [17, 23] and nonsmooth [22] settings, respectively.…”
Section: Introduction
confidence: 99%
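The statement above describes variance reduction through an increasing batch-size of sampled gradients standing in for the unavailable exact gradient. Below is a minimal, self-contained Python sketch of that idea on a toy stochastic quadratic; the toy objective, the geometric batch-size rule N_k = ceil(rho^(-k)), and all names are illustrative assumptions, not the schemes analyzed in the cited works.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
A = rng.standard_normal((d, d))
A = A.T @ A + np.eye(d)                      # toy SPD matrix, so f is strongly convex
b = rng.standard_normal(d)

def sampled_grad(x, n):
    """Average of n noisy gradients of f(x) = 0.5*x'Ax - b'x; variance decays like 1/n."""
    noise = rng.standard_normal((n, d)).mean(axis=0)
    return A @ x - b + noise

x = np.zeros(d)
gamma = 1.0 / np.linalg.norm(A, 2)           # constant steplength gamma <= 1/L
rho = 0.9                                    # illustrative batch-size rule N_k = ceil(rho**(-k))
for k in range(60):
    N_k = int(np.ceil(rho ** (-k)))          # increasing batch-size drives the gradient noise to zero
    x = x - gamma * sampled_grad(x, N_k)

print("distance to minimizer:", np.linalg.norm(x - np.linalg.solve(A, b)))
```

With a constant steplength, the geometrically growing batch-size is what allows the iterates to approach the minimizer itself rather than merely a noise-dominated neighborhood.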
“…& Management, Oklahoma State University, farzad.yousefian@okstate.edu; Yousefian acknowledges the support of the NSF through CAREER grant ECCS-1944500. nonsmoothness arises in a deterministic form . However, in many applications, f (•, ω) may be both nonconvex and nonsmooth and proximal stochastic gradient schemes [10], [14] cannot be directly adopted. We now discuss some relevant research in nonsmooth and nonconvex regimes.…”
Section: Introductionmentioning
confidence: 99%
“…In [2], the authors prove that a limit point of a subsequence is an ε-Clarke stationary point when f is locally Lipschitz, while Kiwiel proved that every limit point is Clarke stationary with respect to f without requiring compactness of level sets [13]. There have also been efforts to develop statements in structured regimes where f is either weakly convex [7], [8] or f = g + h, where h is smooth and possibly nonconvex while g is convex, nonsmooth, and proximable [14], [21].…”
Section: Introduction
confidence: 99%
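The structured regime mentioned at the end of the statement above (f = g + h with h smooth and g convex, nonsmooth, and proximable) is exactly the setting where a proximal-gradient step applies. The following Python sketch illustrates that step on a toy instance; the function names and the lasso-style example (least-squares h, ℓ1 g) are my own illustration, not the algorithms of [14], [21], and the same update is used when h is merely smooth rather than convex.

```python
import numpy as np

def soft_threshold(v, tau):
    """Prox of tau*||.||_1 -- an example of a convex, nonsmooth, proximable g."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def prox_gradient(grad_h, prox_g, x0, gamma, iters=300):
    """Proximal-gradient iteration for min_x h(x) + g(x), with h smooth (possibly nonconvex)."""
    x = x0.copy()
    for _ in range(iters):
        x = prox_g(x - gamma * grad_h(x), gamma)   # gradient step on h, then prox of gamma*g
    return x

# Toy instance (illustrative only): h(x) = 0.5*||Ax - b||^2, g(x) = lam*||x||_1.
rng = np.random.default_rng(1)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)
lam = 0.1
grad_h = lambda x: A.T @ (A @ x - b)
gamma = 1.0 / np.linalg.norm(A, 2) ** 2            # 1/L, L = Lipschitz constant of grad_h
x_hat = prox_gradient(grad_h, lambda v, g: soft_threshold(v, lam * g), np.zeros(10), gamma)
print("number of nonzero entries:", np.count_nonzero(x_hat))
```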