2019
DOI: 10.1609/aaai.v33i01.33015033
Non-Ergodic Convergence Analysis of Heavy-Ball Algorithms

Abstract: In this paper, we revisit the convergence of the Heavy-ball method and present improved convergence complexity results in the convex setting. We provide the first non-ergodic O(1/k) rate result for the Heavy-ball algorithm with constant step size for coercive objective functions. For objective functions satisfying a relaxed strongly convex condition, linear convergence is established under weaker assumptions on the step size and inertial parameter than those made in the existing literature. We extend our results …
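For readers unfamiliar with the iteration being analyzed, here is a minimal sketch of the Heavy-ball update with constant step size alpha and inertial parameter beta. The quadratic test objective and the parameter values are illustrative assumptions, not taken from the paper.

import numpy as np

def heavy_ball(grad, x0, alpha=0.01, beta=0.9, iters=1000):
    # Heavy-ball iteration with constant step size and momentum:
    #   x_{k+1} = x_k - alpha * grad(x_k) + beta * (x_k - x_{k-1})
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(iters):
        x_next = x - alpha * grad(x) + beta * (x - x_prev)
        x_prev, x = x, x_next
    return x

# Illustrative use on the coercive quadratic f(x) = 0.5 * ||A @ x - b||^2
# (hypothetical data chosen only to make the sketch runnable).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
grad = lambda x: A.T @ (A @ x - b)
x_min = heavy_ball(grad, np.zeros(2))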

Cited by 31 publications (32 citation statements). References 13 publications.
“…For b > 0, the function V decreases strictly along the trajectories. Time-discretized variants of the Lyapunov function (2.19) were used [48,63,64] to establish global convergence of the heavy-ball method for particular choices of α_k and γ_k, and different classes of objective functions f. Siegel [65] used Lyapunov-function arguments to show that, for μ-strongly convex functions f, choosing b = 2√μ leads to the estimate…”
Section: The Heavy-Ball Method versus Nonlinear Conjugate Gradients
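For context, the damped dynamics and energy the quote refers to typically take the following standard form (a sketch of the usual continuous-time heavy-ball model; the survey's equation (2.19) may differ in its exact weighting):

\ddot{x}(t) + b\,\dot{x}(t) + \nabla f\bigl(x(t)\bigr) = 0,
\qquad
V(t) = f\bigl(x(t)\bigr) - f^\star + \tfrac{1}{2}\,\bigl\|\dot{x}(t)\bigr\|^{2}.

Along trajectories, \dot{V}(t) = \langle \nabla f(x), \dot{x} \rangle + \langle \dot{x}, \ddot{x} \rangle = -b\,\|\dot{x}(t)\|^{2} \le 0, which is strictly negative for b > 0 whenever \dot{x}(t) \neq 0, matching the strict decrease the quote describes.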
“…More precisely, the Lipschitz constant L and the strong monotonicity constant μ of the gradient of f are required. The convergence of the heavy-ball method is then deduced via Lyapunov's first or second method, yielding local [46,49] or global [48,64] convergence, respectively. The convergence results are typically very strong, involving q-linear convergence towards the minimum.…”
Section: The Heavy-Ball Method versus Nonlinear Conjugate Gradients
“…Note that the HB method (20) adds a momentum term to the gradient step and is sensitive to its parameters. For f ∈ S^{1,1}_{μ,L}, it shares the same theoretical convergence rate (6) as the gradient descent method; see [18,40]. To the best of our knowledge, no work has established the global accelerated rate (17) for the original HB method (20).…”
Section: Related Work
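The rates the quote contrasts can be recalled schematically. The equation numbers (6) and (17) belong to the citing paper and are not reproduced here; the bounds below are the standard textbook forms for f ∈ S^{1,1}_{μ,L} (μ-strongly convex with L-Lipschitz gradient), stated only for orientation:

f(x_k) - f^\star \le \Bigl(1 - \tfrac{\mu}{L}\Bigr)^{k} \bigl(f(x_0) - f^\star\bigr)
\quad \text{(gradient descent with step } 1/L\text{)},

f(x_k) - f^\star \le C \Bigl(1 - \sqrt{\tfrac{\mu}{L}}\Bigr)^{k}
\quad \text{(accelerated rate, } C \text{ a problem-dependent constant)}.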
“…The ODE model (40) given in Sect. 2.3 cannot treat the case μ = 0, and the previous spectral analysis fails.…”
Section: Dynamic Time Rescaling for the Convex Case