Second-order Conditional Gradient Sliding
Preprint, 2020
DOI: 10.48550/arxiv.2002.08907

Abstract: Constrained second-order convex optimization algorithms are the method of choice when a high-accuracy solution to a problem is needed, due to the quadratic convergence rates these methods enjoy when close to the optimum. These algorithms require the solution of a constrained quadratic subproblem at every iteration. In the case where the feasible region can only be accessed efficiently through a linear optimization oracle, and computing first-order information about the function, although possible, is costly, t…
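To make the setting concrete, the sketch below shows one way a constrained quadratic subproblem of the form min_y ⟨g, y − x⟩ + ½⟨y − x, H(y − x)⟩ can be tackled using only a linear minimization oracle (LMO), via plain Frank-Wolfe steps with exact line search. This is an illustrative sketch of the projection-free setting the abstract describes, not the algorithm proposed in the paper; the names `solve_quadratic_subproblem`, `lmo`, and `simplex_lmo`, the tolerance, and the iteration budget are all assumptions.

```python
import numpy as np

def solve_quadratic_subproblem(grad, hess, x, lmo, iters=100, tol=1e-10):
    """Approximately minimize q(y) = <grad, y - x> + 0.5 <y - x, hess (y - x)>
    over a convex feasible set accessed only through a linear minimization
    oracle. Plain Frank-Wolfe with exact line search on the quadratic;
    an illustrative sketch, not the paper's algorithm."""
    y = x.copy()
    for _ in range(iters):
        q_grad = grad + hess @ (y - x)       # gradient of the quadratic model at y
        v = lmo(q_grad)                      # LMO: argmin over the feasible set of <q_grad, v>
        d = v - y                            # Frank-Wolfe direction
        gap = -q_grad @ d                    # Frank-Wolfe gap (>= 0); stopping criterion
        if gap <= tol:
            break
        curv = d @ (hess @ d)
        gamma = 1.0 if curv <= 0 else min(1.0, gap / curv)  # exact line search on [0, 1]
        y = y + gamma * d
    return y

def simplex_lmo(g):
    """LMO of the probability simplex: the vertex (coordinate vector) minimizing <g, v>."""
    v = np.zeros_like(g)
    v[np.argmin(g)] = 1.0
    return v

# Tiny usage example on the simplex with a random positive-definite quadratic model.
rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
H = A.T @ A + np.eye(n)
g = rng.standard_normal(n)
x0 = np.ones(n) / n
y_hat = solve_quadratic_subproblem(g, H, x0, simplex_lmo)
```

Each iteration touches the feasible region only through `lmo`, which is the access model assumed in the abstract.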

Cited by 6 publications (8 citation statements). References 22 publications.

Citation statements (ordered by relevance):
“…Usually, complexity of this step can be estimated as O(n³) arithmetic operations, which comes from the cost of computing a suitable factorization for the Hessian matrix. Alternatively, Hessian-free gradient methods can be applied for computing an inexact step (see [5,4]).…”
Section: Contracting-domain Newton Methods (mentioning)
confidence: 99%
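As background to the "Hessian-free" alternative mentioned in this statement: an inexact Newton-type step can be obtained by running conjugate gradients on the Newton system using only Hessian-vector products, which avoids the O(n³) factorization cost. The sketch below is illustrative only; the callable `hvp` (returning the product of the Hessian with a vector), the tolerance, and the iteration cap are assumptions, and this is not the specific method of [5,4].

```python
import numpy as np

def inexact_newton_step(grad, hvp, tol=1e-6, max_iters=50):
    """Approximately solve the Newton system  H d = -grad  by conjugate
    gradients, using only Hessian-vector products hvp(v) = H @ v, i.e. no
    O(n^3) factorization of H. Assumes H is symmetric positive definite;
    illustrative sketch only."""
    d = np.zeros_like(grad)
    r = -grad.copy()                 # residual of H d = -grad at d = 0
    p = r.copy()
    rs_old = r @ r
    for _ in range(max_iters):
        if np.sqrt(rs_old) <= tol:
            break
        Hp = hvp(p)
        alpha = rs_old / (p @ Hp)    # CG step length
        d += alpha * p
        r -= alpha * Hp
        rs_new = r @ r
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return d

# Usage sketch: for f(x) = 0.5 x^T H x, the Hessian-vector product is just H @ v.
H = np.array([[3.0, 1.0], [1.0, 2.0]])
step = inexact_newton_step(np.array([1.0, -1.0]), lambda v: H @ v)
```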
“…are determined by the dataset. First, we compare the performance of the Contracting-Domain Newton Method (Algorithm 1) and the Aggregating Newton Method (Algorithm 3) with first-order optimization schemes: the Frank-Wolfe algorithm [14], the classical Gradient Method, and the Fast Gradient Method [26].…”
Section: Stochastic Finite-sum Minimization (mentioning)
confidence: 99%
“…We used the fact that ⟨∇h(x*), z^(t) − x*⟩ ≥ 0 by first-order optimality and the fact that ‖z^(t+1) − x*‖ ≤ ‖z^(t) − x*‖ (see (42)) in (44). The inequality in (45) follows by the smoothness of h. Further, we used Young's inequality in (47). Finally, we used the PL inequality (23) (which states that w(z^(t+1)) ≤ ‖∇h(z^(t+1))‖²/(2µ)) and the fact that c ≥ ‖∇h(x*)‖² in (48).…”
Section: Results for the A²FW Algorithm (mentioning)
confidence: 99%
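For reference, the two auxiliary inequalities invoked in this statement, written in a standard form (the exact constants chosen in (47) of the citing paper are not reproduced here):

```latex
% Young's inequality for an inner product (any \epsilon > 0):
\langle u, v \rangle \;\le\; \frac{1}{2\epsilon}\,\|u\|^2 \;+\; \frac{\epsilon}{2}\,\|v\|^2 ,
% and the PL-type bound quoted above:
w\!\left(z^{(t+1)}\right) \;\le\; \frac{1}{2\mu}\,\bigl\|\nabla h\bigl(z^{(t+1)}\bigr)\bigr\|^2 .
```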
“…or prioritizing in-face steps [45], or theoretical results such as [46] and [47] show that FW variants must use active sets containing the optimal solution after crossing a polytope-dependent radius of convergence. These results, however, do not use combinatorial properties of previous minimizers or detect tight sets with provable guarantees and round to those.…”
Section: Logistic Loss (mentioning)
confidence: 99%
“…For this reason, FW methods fall into the category of projection-free methods [64]. Furthermore, the method can be used to approximately solve quadratic subproblems in accelerated schemes, an approach usually referred to as conditional gradient sliding (see, e.g., [20,65]).…”
Section: Introduction (mentioning)
confidence: 99%
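To illustrate the projection-free property mentioned here: a vanilla Frank-Wolfe loop touches the feasible region only through its linear minimization oracle and never computes a projection. The sketch below is generic (the `grad_f` and `lmo` callables and the classical 2/(t+2) step size are assumptions) and is not the conditional gradient sliding scheme of [20,65].

```python
import numpy as np

def frank_wolfe(grad_f, lmo, x0, iters=200):
    """Vanilla Frank-Wolfe: each iteration calls only a linear minimization
    oracle and forms a convex combination, so no projection onto the feasible
    set is ever computed ('projection-free'). Generic sketch with the
    classical 2/(t+2) step size; not the conditional gradient sliding scheme."""
    x = np.asarray(x0, dtype=float).copy()
    for t in range(iters):
        v = lmo(grad_f(x))                    # vertex minimizing <grad_f(x), v> over the feasible set
        gamma = 2.0 / (t + 2.0)               # classical open-loop step size
        x = (1.0 - gamma) * x + gamma * v     # convex combination stays feasible
    return x
```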