2020
DOI: 10.48550/arxiv.2003.03910
Preprint

Geometry of First-Order Methods and Adaptive Acceleration

Abstract: First-order operator splitting methods are ubiquitous across many fields of science and engineering, such as inverse problems, signal/image processing, statistics, data science and machine learning, to name a few. In this paper, we study a geometric property of first-order methods when applied to solve non-smooth optimization problems. With the tool of "partial smoothness", we design a framework to analyze the trajectory of the fixed-point sequence generated by first-order methods and show that locally the fi…
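The abstract frames first-order splitting methods as fixed-point iterations whose trajectory is then analyzed. As a purely illustrative instance (not code from the paper), the sketch below runs proximal gradient descent on a LASSO problem, treating each update as one application of the fixed-point map and recording the trajectory such an analysis would study; the problem choice, function names, and parameters are all assumptions.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def forward_backward_trajectory(A, b, lam, n_iter=200):
    """Proximal gradient descent for the LASSO
        min_x 0.5 * ||A x - b||^2 + lam * ||x||_1,
    viewed as the fixed-point iteration x_{k+1} = F(x_k); the whole
    trajectory is returned so its local behaviour can be inspected."""
    gamma = 1.0 / np.linalg.norm(A, 2) ** 2     # step size 1 / L
    x = np.zeros(A.shape[1])
    trajectory = [x.copy()]
    for _ in range(n_iter):
        x = soft_threshold(x - gamma * A.T @ (A @ x - b), gamma * lam)
        trajectory.append(x.copy())
    return np.array(trajectory)
```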

Cited by 6 publications (7 citation statements)
References 71 publications (110 reference statements)
“…However, similarly to Mai and Johansson (2019) and Poon and Liang (2020), we observed that regularizing the linear system does not seem necessary, and can even hurt the convergence speed. Figure 5 shows the influence of the regularization parameter on the convergence on the rcv1 dataset for a sparse logistic regression problem, with K = 5 and λ = λ_max/30.…”
Section: Parameter Setting (supporting)
confidence: 84%
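The regularization discussed in this statement is the Tikhonov term sometimes added to the least-squares system inside Anderson extrapolation. Below is a minimal numpy sketch of one Anderson-type extrapolation step, with `reg` playing that role (setting `reg = 0` corresponds to the unregularized system the quote finds sufficient); the function name, the window handling, and the residual-difference formulation are illustrative assumptions, not the cited authors' implementation.

```python
import numpy as np

def anderson_step(X, FX, reg=0.0):
    """One Anderson-type extrapolation step (residual-difference form).

    X, FX : arrays of shape (m+1, d) with the last m+1 iterates x_i and
            their images F(x_i) under the fixed-point map.
    reg   : Tikhonov parameter added to the least-squares normal
            equations; the quote above suggests reg = 0 often suffices.
    """
    R = FX - X                           # residuals r_i = F(x_i) - x_i
    dR = np.diff(R, axis=0)              # residual differences, (m, d)
    dF = np.diff(FX, axis=0)             # image differences,    (m, d)
    # gamma solves  min_g ||r_k - dR.T @ g||^2 + reg * ||g||^2
    A = dR @ dR.T + reg * np.eye(dR.shape[0])
    gamma = np.linalg.solve(A, dR @ R[-1])
    return FX[-1] - gamma @ dF           # extrapolated next iterate
```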
“…We compare multiple algorithms to solve popular Machine Learning problems: the Lasso, the elastic net, and sparse logistic regression (experiments on group lasso are in Appendix A.5). The compared algorithms are the following: proximal gradient descent (PGD, Combettes and Wajs 2005), Nesterov-like inertial PGD (FISTA, Beck and Teboulle 2009), Anderson accelerated PGD (Mai and Johansson, 2019; Poon and Liang, 2020), proximal coordinate descent (PCD, Tseng and Yun 2009), inertial PCD (Lin et al., 2014; Fercoq and Richtárik, 2015), Anderson accelerated PCD (ours). We use datasets from libsvm (Fan et al., 2008) and openml (Feurer et al., 2019) (Table 1), varying as much as possible to demonstrate the versatility of our approach.…”
Section: Numerical Comparison on Machine Learning Problems (mentioning)
confidence: 99%
“…It was shown in [30, Section 4] by example that one-step inertial extrapolation w_n = x_n + θ(x_n − x_{n−1}), θ ∈ [0, 1), may fail to provide acceleration. It was remarked in [24, Chapter 4] that the use of inertia of more than two points x_n, x_{n−1} could provide acceleration.…”
Section: Introduction (mentioning)
confidence: 99%
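For concreteness, here is a minimal sketch of the one-step inertial scheme quoted above, applied to a generic fixed-point operator T; the function name, the fixed choice of θ, and the iteration count are illustrative assumptions, and, as the quote warns, this extrapolation need not yield acceleration.

```python
import numpy as np

def inertial_fixed_point(T, x0, theta=0.3, n_iter=100):
    """One-step inertial iteration:
        w_n = x_n + theta * (x_n - x_{n-1}),   x_{n+1} = T(w_n),
    with theta in [0, 1).  As the quote notes, this single-point
    inertia need not accelerate the underlying fixed-point method."""
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(n_iter):
        w = x + theta * (x - x_prev)     # inertial extrapolation point
        x_prev, x = x, T(w)              # fixed-point step at w
    return x
```

With T chosen as the proximal gradient map from the earlier LASSO sketch, this reproduces an inertial proximal gradient iteration of the kind the quoted passage refers to.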
“…Advantages of two-step proximal point algorithms. In [47,48], Poon and Liang discussed some limitations of the inertial Douglas–Rachford splitting method and inertial ADMM. For example, consider the following feasibility problem in…”
Section: Introduction (mentioning)
confidence: 99%