2018
DOI: 10.48550/arxiv.1810.04539
Preprint

Nonlinear Acceleration of Momentum and Primal-Dual Algorithms

Abstract: We describe a convergence acceleration scheme for multistep optimization algorithms. The extrapolated solution is written as a nonlinear average of the iterates produced by the original optimization algorithm. Our scheme does not need the underlying fixed-point operator to be symmetric, hence handles e.g. algorithms with momentum terms such as Nesterov's accelerated method, or primal-dual methods. The weights are computed via a simple linear system and we analyze performance in both online and offline modes. W…
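The abstract sketches the scheme at a high level: the extrapolated point is a weighted average of the iterates, with weights obtained from a simple linear system built on the residuals. The snippet below is a minimal offline-mode sketch of that idea in the spirit of regularized nonlinear acceleration; the function name extrapolate, the Tikhonov level lam, and the choice of which iterates to average are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

def extrapolate(iterates, lam=1e-8):
    """Offline extrapolation sketch (assumed form, not the paper's exact scheme).

    Builds residuals r_i = x_{i+1} - x_i, solves a small regularized linear
    system for weights that sum to one, and returns the weighted average of
    the iterates.
    """
    X = np.stack([np.asarray(x, dtype=float) for x in iterates], axis=1)  # columns x_0 ... x_k
    R = X[:, 1:] - X[:, :-1]                       # residual matrix, one column per step
    k = R.shape[1]
    M = R.T @ R
    M += lam * np.linalg.norm(M, 2) * np.eye(k)    # Tikhonov regularization (illustrative level)
    z = np.linalg.solve(M, np.ones(k))             # solve the small linear system (R^T R + reg I) z = 1
    c = z / z.sum()                                # normalize so the weights sum to one
    return X[:, :k] @ c                            # extrapolated point: sum_i c_i x_i
```

In offline mode this would be applied once to a stored trajectory (e.g. iterates of Nesterov's method or a primal-dual solver); in online mode the same weight computation is re-run each time a new iterate arrives.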

Cited by 7 publications (10 citation statements)
References 18 publications
“…Therefore, accelerating the convergence rate of the fixed point iteration has attracted considerable interest. Anderson Acceleration (AA) (Anderson, 1965) is among the most popular techniques to speed up the convergence of fixed point iteration (Bollapragada et al., 2018; Scieur et al., 2018; Walker and Ni, 2011). The key idea behind the AA strategy is to maintain the history of h recent iterations and predict the new iteration by using a linear combination of this history where the weights are extracted by solving an optimization problem.…”
Section: Fixed Point Iteration and Anderson Acceleration (mentioning)
confidence: 99%
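The statement above describes the generic AA template: keep a window of h recent iterates and form the next point as a linear combination of them, with weights chosen by a small optimization problem on the residuals. A hedged sketch of such a loop is below; the name anderson_fixed_point, the window h=5, and the ridge term reg are illustrative choices rather than any cited implementation.

```python
import numpy as np

def anderson_fixed_point(g, x0, h=5, max_iter=200, tol=1e-10, reg=1e-10):
    """Illustrative Anderson Acceleration loop for a fixed-point map g (sketch only).

    Keeps the h most recent iterates, forms residuals f_i = g(x_i) - x_i, and
    picks mixing weights that sum to one via a small regularized linear system.
    """
    xs = [np.asarray(x0, dtype=float)]
    gs = [np.asarray(g(xs[0]), dtype=float)]
    for _ in range(max_iter):
        m = min(len(xs), h)
        X = np.stack(xs[-m:], axis=1)               # last m iterates
        G = np.stack(gs[-m:], axis=1)               # corresponding map evaluations
        F = G - X                                   # residuals f_i = g(x_i) - x_i
        M = F.T @ F + reg * np.eye(m)               # regularized normal equations
        z = np.linalg.solve(M, np.ones(m))
        alpha = z / z.sum()                         # weights summing to one
        x_new = G @ alpha                           # linear combination of the history
        g_new = np.asarray(g(x_new), dtype=float)
        if np.linalg.norm(g_new - x_new) < tol:     # stop when x_new is (nearly) a fixed point
            return x_new
        xs.append(x_new)
        gs.append(g_new)
    return xs[-1]
```

For example, taking g(x) = x - step * grad_f(x) turns gradient descent into a fixed-point iteration that this loop can accelerate; the surrounding citation statements note that the analysis is harder when the iteration matrix of g is not symmetric, as with momentum or primal-dual methods.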
“…The quality of the bound (in particular, its eventual convergence to 0) crucially depends on P(T). Using the Crouzeix conjecture (Crouzeix, 2004), Bollapragada et al. (2018) managed to bound P(T), with P a polynomial:…”
Section: Anderson Extrapolation for Nonsymmetric Iteration Matrices (mentioning)
confidence: 99%
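For context, the Crouzeix conjecture referenced in this statement bounds a polynomial of a (possibly nonsymmetric) matrix by the polynomial's values on the numerical range; in LaTeX notation,

\|P(T)\| \le 2 \max_{z \in W(T)} |P(z)|, \qquad W(T) = \{\, x^{*} T x : \|x\|_2 = 1 \,\},

where the constant 2 is conjectural (a bound with constant 1+\sqrt{2} was proved by Crouzeix and Palencia, 2017). This is what lets the cited analysis control P(T) for nonsymmetric iteration matrices through polynomial bounds over the numerical range rather than over the spectrum alone.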
“…As we recall below, results on Anderson acceleration mainly concern fixed-point iterations with symmetric iteration matrices T, and results concerning non-symmetric iteration matrices are weaker (Bollapragada et al., 2018). Poon and Liang (2020, Thm 6.4) do not assume that T is symmetric, but only diagonalizable, which is still a strong requirement.…”
Section: Introduction (mentioning)
confidence: 99%
“…However, AA and optimization algorithms have been developed quite independently and only limited connections were discovered and studied [16,18]. Very recently, the technique has started to gain a significant interest in the optimization community (see, e.g., [47,46,5,53,19,39]). Specifically, a series of papers [47,46,5] adapt AA to accelerate several classical algorithms for unconstrained optimization; [53] studies a variant of AA for non-expansive operators; [19] proposes an application of AA to Douglas-Rachford splitting; and [39] uses AA to improve the performance of the ADMM method.…”
Section: Related Work (mentioning)
confidence: 99%
“…Very recently, the technique has started to gain a significant interest in the optimization community (see, e.g., [47,46,5,53,19,39]). Specifically, a series of papers [47,46,5] adapt AA to accelerate several classical algorithms for unconstrained optimization; [53] studies a variant of AA for non-expansive operators; [19] proposes an application of AA to Douglas-Rachford splitting; and [39] uses AA to improve the performance of the ADMM method. There is also an emerging literature on applications of AA in machine learning [23,28,20,36].…”
Section: Related Work (mentioning)
confidence: 99%