2015
DOI: 10.1007/s11590-015-0936-x

On the global convergence rate of the gradient descent method for functions with Hölder continuous gradients

Abstract: The gradient descent method minimizes an unconstrained nonlinear optimization problem at a rate of O(1/√K), where K is the number of iterations performed by the gradient method. Traditionally, this analysis is obtained for smooth objective functions having Lipschitz continuous gradients. This paper considers a more general class of nonlinear programming problems in which the functions have Hölder continuous gradients. More precisely, for any function f in this class, denoted by C^{1,ν}_L, there is a ν ∈ (0, 1] an…
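For reference, the Hölder-continuity condition invoked in the abstract is commonly written as follows; this is the standard formulation, and the paper's exact normalization of the constant L and the choice of norm may differ:

\[
  \|\nabla f(x) - \nabla f(y)\| \;\le\; L\,\|x - y\|^{\nu}
  \qquad \text{for all } x, y \in \mathbb{R}^n,\quad \nu \in (0, 1],
\]

where the choice ν = 1 recovers the usual class of functions with Lipschitz continuous gradients.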

Cited by 15 publications (21 citation statements)
References 30 publications
“…• A byproduct of our work is a global convergence result for Hölder methods, which were earlier investigated in the literature [5,19,28,38].…”
Section: Contributions (mentioning)
confidence: 78%
“…There are many results which consider notions other than smoothness and strong convexity for first-order methods. Some examples of this are work on star-convexity [GG17, NGGD18, HSS20], quasi-strong convexity [NNG19], semi-convexity [VNP07], the quadratic growth condition [Ani00], the error bound property [LT93, FHKO10], restricted strong convexity [ZY13, ZC15] and Hölder continuity [ZY13, DGN14, Yas16, Gri19]. However, we are unaware of notions of fine-grained condition numbers for non-linear or stochastic problems appearing previously in the literature.…”
Section: Prior Work (mentioning)
confidence: 99%
“…We present an example of an optimization method which uses inequality (2) to assert the convergence of the method to a stationary point. To compute the step size, the method requires an L satisfying (2). For a given such L, the global complexity of the method is proportional to L^{1/ν}.…”
Section: Motivation (mentioning)
confidence: 99%
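To make the role of such a constant concrete, below is a minimal Python sketch of a gradient method that estimates an admissible constant on the fly by doubling a trial value until a sufficient-decrease test holds. The handles f and grad, the doubling rule, and the tolerances are illustrative assumptions; this is not the exact scheme, nor the specific inequality (2), of the cited paper.

import numpy as np

def gradient_descent_backtracking(f, grad, x0, L0=1.0, max_iter=1000, tol=1e-6):
    # Illustrative sketch only: estimate a local constant L by doubling it
    # until a sufficient-decrease test holds, then take the step -grad/L.
    x = np.asarray(x0, dtype=float)
    L = L0
    for _ in range(max_iter):
        g = grad(x)
        gnorm = np.linalg.norm(g)
        if gnorm <= tol:                 # approximate stationary point reached
            break
        while True:                      # backtracking on the estimate L
            x_trial = x - g / L
            # Sufficient decrease implied by a quadratic upper model with
            # constant L at the current iterate.
            if f(x_trial) <= f(x) - gnorm**2 / (2.0 * L):
                break
            L *= 2.0
        x = x_trial
        L = max(L0, L / 2.0)             # allow the estimate to shrink again
    return x

For a function with a ν-Hölder continuous gradient the inner loop terminates whenever the gradient norm is above the tolerance, and a larger admissible constant forces smaller steps, which is consistent with the complexity growing with a power of that constant, as in the statement above.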
“…For example, in the global convergence analysis found in [2, Section 3], the function is assumed to have a Hölder continuous gradient, and the very first step is a lemma stating that this property implies the existence of a global upper bound on the error of the first-order Taylor approximation of the function. In fact, the main complexity result [2, Corollary 2] can be obtained by assuming the global upper bound on the error of the first-order Taylor approximation, while disregarding the Hölder continuity of the gradient.…”
Section: Introduction (mentioning)
confidence: 99%
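The Taylor-approximation bound referred to in this statement has a well-known standard form, reproduced here for orientation (the constant in [2] may be normalized differently):

\[
  \bigl| f(y) - f(x) - \langle \nabla f(x),\, y - x \rangle \bigr|
  \;\le\; \frac{L}{1+\nu}\,\|y - x\|^{1+\nu}
  \qquad \text{for all } x, y,
\]

which follows by integrating the Hölder condition on the gradient along the segment from x to y.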