2021
DOI: 10.48550/arxiv.2112.04660
Preprint

A Fully Single Loop Algorithm for Bilevel Optimization without Hessian Inverse

Abstract: In this paper, we propose a new Hessian-inverse-free Fully Single Loop Algorithm (FSLA) for bilevel optimization problems. Classic algorithms for bilevel optimization use a double-loop structure, which is computationally expensive. Recently, several single-loop algorithms have been proposed that optimize the inner and outer variables alternately. However, these algorithms are not yet fully single loop, as they overlook the loop needed to evaluate the hyper-gradient for a given inner and outer state. In…
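To make the "fully single loop" idea concrete, here is a minimal sketch in JAX. The toy objectives f and g, the step sizes, and the variable names are illustrative assumptions, not the paper's exact FSLA update: each iteration takes one gradient step on the inner variable, one step on an auxiliary vector v that tracks the inverse-Hessian-vector product, and one step on the outer variable, so there is no inner loop and no explicit Hessian inverse.

```python
import jax
import jax.numpy as jnp

def f(x, y):  # outer objective (illustrative toy problem)
    return jnp.sum((y - 1.0) ** 2) + 0.1 * jnp.sum(x ** 2)

def g(x, y):  # inner objective (illustrative, strongly convex in y)
    return jnp.sum((y - x) ** 2)

grad_f_x = jax.grad(f, argnums=0)
grad_f_y = jax.grad(f, argnums=1)
grad_g_y = jax.grad(g, argnums=1)

def fully_single_loop_step(x, y, v, alpha=0.2, beta=0.2, eta=0.05):
    # (1) one gradient step on the inner variable y
    y = y - alpha * grad_g_y(x, y)
    # (2) one step moving v toward [nabla^2_yy g]^{-1} nabla_y f, using only a
    #     Hessian-vector product, so no linear system is solved exactly
    hvp = jax.jvp(lambda y_: grad_g_y(x, y_), (y,), (v,))[1]
    v = v - beta * (hvp - grad_f_y(x, y))
    # (3) hypergradient estimate nabla_x f - nabla^2_xy g . v, then one step on x
    cross = jax.grad(lambda x_: jnp.vdot(grad_g_y(x_, y), v))(x)
    x = x - eta * (grad_f_x(x, y) - cross)
    return x, y, v

x, y, v = jnp.zeros(3), jnp.zeros(3), jnp.zeros(3)
for _ in range(500):
    x, y, v = fully_single_loop_step(x, y, v)
```

The design point illustrated here is that y and v are carried (warm-started) across iterations rather than recomputed, which is what removes both the inner-solver loop and the hyper-gradient loop.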

Cited by 4 publications (7 citation statements); References 22 publications
“…solve the inner problem with one step per hyper-iteration. [19,22,26,25,28,6,63,24,32]. Meanwhile, there are also works utilizing other strategies like penalty methods [42], and also other formulations like the case where the inner problem has non-unique minimizers [31].…”
Section: Related Work
confidence: 99%
“…2021b) stepsizes, or by employing larger and complexity-dependent mini-batches (Ji et al., 2021). Warm-starting also the LS can further improve the sample complexity to O(ǫ⁻²) (Arbel and Mairal, 2021; Li et al., 2021). The complexity O(ǫ⁻²) is optimal, since the optimal sample complexity of methods using unbiased stochastic gradient oracles with bounded … [Table 1 caption: Sample complexity (SC) of stochastic bilevel optimization methods for finding an ǫ-stationary point of Problem 2 with LL of type 3.]…”
Section: LS Solver
confidence: 99%
“…A common procedure to improve the overall performance of bilevel algorithms is that of using the LL (or LS) approximate solution found at the (s − 1)-th UL iteration as a starting point for the LL (or LS) solver at the s-th UL iteration. This strategy, which is called warm-start, reduces the number of LL (or LS) iterations needed by the bilevel procedure and is thought to be fundamental to achieve the optimal sample complexity (Arbel and Mairal, 2021; Li et al., 2021). Furthermore, in the stochastic setting (2), warm-start is often accompanied by the use of large mini-batches, i.e.…”
Section: Introduction
confidence: 99%
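As a schematic illustration of the warm-start strategy described in the excerpt above: the lower-level (LL) solver at outer iteration s is initialized at the approximate solution found at iteration s − 1 instead of a fresh point. The toy objective, step counts, and the placeholder outer update below are assumptions for illustration only.

```python
import jax
import jax.numpy as jnp

def inner_solve(x, y_init, steps=5, lr=0.2):
    """Run a few gradient steps on a toy lower-level objective, starting at y_init."""
    grad_g_y = jax.grad(lambda x_, y_: jnp.sum((y_ - x_) ** 2), argnums=1)
    y = y_init
    for _ in range(steps):
        y = y - lr * grad_g_y(x, y)
    return y

x = jnp.ones(3)
y = jnp.zeros(3)              # LL state carried across outer (UL) iterations
for s in range(20):
    y = inner_solve(x, y)     # warm start: reuse the previous approximate solution
    # ... hypergradient evaluation at (x, y) and the UL update would go here ...
    x = x - 0.05 * x          # placeholder outer step (illustrative only)
```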
“…Based on different approaches to the estimation of hypergradient, these methods are divided into two categories, i.e. Approximate Implicit Differentiation (AID) [11,14,15,18,22,26,58] and Iterative Differentiation (ITD) [9,10,34,42]. ITD methods first solve the lower level problem approximately and then calculate the hypergradient with backward (forward) automatic differentiation, while AID methods approximate the exact hypergradient [11,19,30,33].…”
Section: Related Work
confidence: 99%
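A compact sketch of the two hypergradient families mentioned above, with toy objectives; this is a generic illustration of ITD (differentiating through an unrolled inner solver) versus AID (approximating the inverse-Hessian-vector product and assembling the implicit gradient), not any cited paper's exact implementation.

```python
import jax
import jax.numpy as jnp

def f(x, y):  # outer objective (toy)
    return jnp.sum((y - 1.0) ** 2) + 0.1 * jnp.sum(x ** 2)

def g(x, y):  # inner objective (toy, strongly convex in y)
    return jnp.sum((y - x) ** 2)

def itd_hypergrad(x, y0, K=20, lr=0.2):
    """ITD: solve the lower level by unrolling K gradient steps, then
    backpropagate through the whole unrolled computation."""
    def outer_of_x(x_):
        y = y0
        for _ in range(K):
            y = y - lr * jax.grad(g, argnums=1)(x_, y)
        return f(x_, y)
    return jax.grad(outer_of_x)(x)

def aid_hypergrad(x, y, J=20, lr=0.2):
    """AID: approximate v = [H_yy g]^{-1} grad_y f by a few fixed-point steps,
    then assemble grad_x f - H_xy g . v (no explicit Hessian inverse)."""
    gy = jax.grad(g, argnums=1)
    v = jnp.zeros_like(y)
    for _ in range(J):
        hvp = jax.jvp(lambda y_: gy(x, y_), (y,), (v,))[1]
        v = v - lr * (hvp - jax.grad(f, argnums=1)(x, y))
    cross = jax.grad(lambda x_: jnp.vdot(gy(x_, y), v))(x)
    return jax.grad(f, argnums=0)(x, y) - cross
```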
“Bilevel optimization problems [44,48,56] involve two levels of problems: an inner problem and an outer problem. Efficient gradient-based alternative update algorithms [15,18,26] have recently been proposed to solve non-distributed bilevel problems, but efficient algorithms designed for the FL setting have not yet been shown. In fact, the most challenging step is to evaluate the hypergradient (gradient w.r.t. the variable of the outer problem).…”
Section: Introduction
confidence: 99%
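For reference, under the standard assumption that the lower-level objective g(x, ·) is strongly convex with a unique minimizer y*(x), the hypergradient mentioned in this excerpt takes the classical implicit-function-theorem form (a standard result, not specific to any one of the cited papers):

```latex
% Hypergradient of F(x) = f(x, y^*(x)) with y^*(x) = \arg\min_y g(x, y),
% assuming g(x, \cdot) is strongly convex so that y^*(x) is unique and smooth.
\nabla F(x) = \nabla_x f\bigl(x, y^*(x)\bigr)
  - \nabla^2_{xy} g\bigl(x, y^*(x)\bigr)\,
    \bigl[\nabla^2_{yy} g\bigl(x, y^*(x)\bigr)\bigr]^{-1}
    \nabla_y f\bigl(x, y^*(x)\bigr)
```

The inverse-Hessian term is what the AID-style and fully single-loop methods discussed above approximate iteratively rather than computing exactly.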