2020
DOI: 10.48550/arxiv.2003.03910
Preprint

Geometry of First-Order Methods and Adaptive Acceleration

Abstract: First-order operator splitting methods are ubiquitous across many fields of science and engineering, such as inverse problems, signal/image processing, statistics, data science and machine learning, to name a few. In this paper, we study a geometric property of first-order methods when applied to solve non-smooth optimization problems. With the tool of "partial smoothness", we design a framework to analyze the trajectory of the fixed-point sequence generated by first-order methods and show that locally the fi…
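The abstract frames first-order splitting methods as fixed-point iterations whose trajectory is then analyzed. As a purely illustrative instance (not code from the paper), the sketch below runs proximal gradient descent on a LASSO problem, treating each update as one application of the fixed-point map and recording the trajectory such an analysis would study; the problem choice, function names, and parameters are all assumptions.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def forward_backward_trajectory(A, b, lam, n_iter=200):
    """Proximal gradient descent for the LASSO
        min_x 0.5 * ||A x - b||^2 + lam * ||x||_1,
    viewed as the fixed-point iteration x_{k+1} = F(x_k); the whole
    trajectory is returned so its local behaviour can be inspected."""
    gamma = 1.0 / np.linalg.norm(A, 2) ** 2     # step size 1 / L
    x = np.zeros(A.shape[1])
    trajectory = [x.copy()]
    for _ in range(n_iter):
        x = soft_threshold(x - gamma * A.T @ (A @ x - b), gamma * lam)
        trajectory.append(x.copy())
    return np.array(trajectory)
```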

Cited by 6 publications (7 citation statements)
References 71 publications (110 reference statements)
“…However, similarly to Mai and Johansson (2019) and Poon and Liang (2020), we observed that regularizing the linear system does not seem necessary, and can even hurt the convergence speed. Figure 5 shows the influence of the regularization parameter on the convergence on the rcv1 dataset for a sparse logistic regression problem, with K = 5 and λ = λ_max/30.…”
Section: Parameter Setting (supporting)
confidence: 84%
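The regularization discussed in this statement is the Tikhonov term sometimes added to the least-squares system inside Anderson extrapolation. Below is a minimal numpy sketch of one Anderson-type extrapolation step, with `reg` playing that role (setting `reg = 0` corresponds to the unregularized system the quote finds sufficient); the function name, the window handling, and the residual-difference formulation are illustrative assumptions, not the cited authors' implementation.

```python
import numpy as np

def anderson_step(X, FX, reg=0.0):
    """One Anderson-type extrapolation step (residual-difference form).

    X, FX : arrays of shape (m+1, d) with the last m+1 iterates x_i and
            their images F(x_i) under the fixed-point map.
    reg   : Tikhonov parameter added to the least-squares normal
            equations; the quote above suggests reg = 0 often suffices.
    """
    R = FX - X                           # residuals r_i = F(x_i) - x_i
    dR = np.diff(R, axis=0)              # residual differences, (m, d)
    dF = np.diff(FX, axis=0)             # image differences,    (m, d)
    # gamma solves  min_g ||r_k - dR.T @ g||^2 + reg * ||g||^2
    A = dR @ dR.T + reg * np.eye(dR.shape[0])
    gamma = np.linalg.solve(A, dR @ R[-1])
    return FX[-1] - gamma @ dF           # extrapolated next iterate
```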
“…We compare multiple algorithms to solve popular Machine Learning problems: the Lasso, the elastic net, and sparse logistic regression (experiments on group lasso are in Appendix A.5). The compared algorithms are the following: proximal gradient descent (PGD, Combettes and Wajs 2005), Nesterov-like inertial PGD (FISTA, Beck and Teboulle 2009), Anderson accelerated PGD (Mai and Johansson, 2019; Poon and Liang, 2020), proximal coordinate descent (PCD, Tseng and Yun 2009), inertial PCD (Lin et al., 2014; Fercoq and Richtárik, 2015), Anderson accelerated PCD (ours). We use datasets from libsvm (Fan et al., 2008) and openml (Feurer et al., 2019) (Table 1), varying as much as possible to demonstrate the versatility of our approach.…”
Section: Numerical Comparison on Machine Learning Problems (mentioning)
confidence: 99%
“…It was shown in [30, Section 4] by example that one-step inertial extrapolation w_n = x_n + θ(x_n − x_{n−1}), θ ∈ [0, 1), may fail to provide acceleration. It was remarked in [24, Chapter 4] that the use of inertia of more than two points x_n, x_{n−1} could provide acceleration.…”
Section: Introduction (mentioning)
confidence: 99%
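For concreteness, here is a minimal sketch of the one-step inertial scheme quoted above, applied to a generic fixed-point operator T; the function name, the fixed choice of θ, and the iteration count are illustrative assumptions, and, as the quote warns, this extrapolation need not yield acceleration.

```python
import numpy as np

def inertial_fixed_point(T, x0, theta=0.3, n_iter=100):
    """One-step inertial iteration:
        w_n = x_n + theta * (x_n - x_{n-1}),   x_{n+1} = T(w_n),
    with theta in [0, 1).  As the quote notes, this single-point
    inertia need not accelerate the underlying fixed-point method."""
    x_prev = x0.copy()
    x = x0.copy()
    for _ in range(n_iter):
        w = x + theta * (x - x_prev)     # inertial extrapolation point
        x_prev, x = x, T(w)              # fixed-point step at w
    return x
```

With T chosen as the proximal gradient map from the earlier LASSO sketch, this reproduces an inertial proximal gradient iteration of the kind the quoted passage refers to.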
“…Advantages of two-step proximal point algorithms. In [47,48], Poon and Liang discussed some limitations of the inertial Douglas–Rachford splitting method and inertial ADMM. For example, consider the following feasibility problem in…”
Section: Introduction (mentioning)
confidence: 99%