2011
DOI: 10.1007/978-3-642-24412-4_11

Adaptive and Optimal Online Linear Regression on ℓ1-Balls

Cited by 9 publications (15 citation statements; citing works published 2011–2024). References 14 publications.
“…In particular, the EG± algorithm of Kivinen and Warmuth (1997) uses an exponential update rule to formulate an online linear regression algorithm that performs comparably to the best linear predictor under sparsity assumptions. The adaptive EG± algorithm of Gerchinovitz and Yu (2011) further proposes a parameter-free version of EG± in which the learning rate η_t is updated adaptively and is a decreasing function of the time step t.…”
Section: Related Work (mentioning)
confidence: 99%
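For concreteness, the exponential update behind EG± can be sketched in a few lines of Python. This is a minimal illustration of the general recipe only; the function name eg_pm, the fixed learning rate eta, and the ℓ1-radius U are assumptions for this sketch, and the adaptive variant of Gerchinovitz and Yu (2011) would replace the constant eta with a schedule that decreases with t (see the next excerpt).

```python
import numpy as np

def eg_pm(X, y, U=1.0, eta=0.1):
    """Minimal sketch of an EG+/- style update for online linear
    regression on the l1-ball of radius U (after Kivinen and
    Warmuth, 1997). All names and the fixed learning rate eta are
    illustrative assumptions, not the authors' exact algorithm."""
    T, d = X.shape
    # One positive and one negative weight per coordinate; the
    # normalized difference keeps the predictor inside the l1-ball.
    w_plus = np.full(d, 1.0 / (2 * d))
    w_minus = np.full(d, 1.0 / (2 * d))
    preds = np.empty(T)
    for t in range(T):
        x = X[t]
        y_hat = U * np.dot(w_plus - w_minus, x)
        preds[t] = y_hat
        g = 2.0 * (y_hat - y[t])  # gradient of the square loss in y_hat
        # Exponential (multiplicative) update, then renormalize.
        w_plus *= np.exp(-eta * g * U * x)
        w_minus *= np.exp(eta * g * U * x)
        total = w_plus.sum() + w_minus.sum()
        w_plus /= total
        w_minus /= total
    return preds
```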
“…We repeat this process until the stopping criteria are met. Note that we use the anytime learning rate schedule of Gerchinovitz and Yu (2011), which is a decreasing function of time t (see Appendix C for more details). A summary of the proposed algorithm, which we refer to as Combinatorial Optimization with Monomial Experts (COMEX), is given in Algorithm 1.…”
Section: The COMEX Algorithm (mentioning)
confidence: 99%
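The "anytime" aspect quoted above just means the learning rate is a decreasing function of t that requires no knowledge of the horizon. A hedged sketch of such a schedule is below; the exact tuning of Gerchinovitz and Yu (2011) depends on problem parameters and observed data, so the particular form and constants here are assumptions.

```python
import math

def anytime_eta(t: int, d: int, B: float = 1.0, U: float = 1.0) -> float:
    """Illustrative anytime learning-rate schedule: decreasing in the
    round index t and usable without knowing the horizon T. The
    sqrt(log(2d)/t) form and the B*U scaling are assumptions in the
    spirit of, not identical to, Gerchinovitz and Yu (2011)."""
    return min(1.0 / (B * U), math.sqrt(math.log(2 * d) / (t + 1)))
```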
“…We assume that the set of outcomes Y is a bounded set, a restriction that can be removed by standard truncation arguments (see e.g. [12]). Let X be some set of covariates, and let F be a class of functions X → Y for some Y ⊆ R. Recall the protocol of the online prediction problem: On each round t ∈ {1, …”
Section: Assumptions and Definitions (mentioning)
confidence: 99%
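The protocol recalled in this excerpt is the standard online prediction loop. A minimal sketch follows, with hypothetical callback names (predict, update) standing in for the forecaster's internals:

```python
from typing import Callable, Iterable, Tuple

def online_protocol(rounds: Iterable[Tuple[object, float]],
                    predict: Callable[[object], float],
                    update: Callable[[object, float, float], None]) -> float:
    """Sketch of the online prediction protocol: on each round t the
    forecaster sees a covariate x_t, commits to a prediction, then
    observes the outcome y_t and suffers the square loss. The
    callback names are illustrative assumptions."""
    total_loss = 0.0
    for x_t, y_t in rounds:
        y_hat = predict(x_t)               # prediction before y_t is revealed
        total_loss += (y_hat - y_t) ** 2   # square loss on this round
        update(x_t, y_t, y_hat)            # forecaster updates its state
    return total_loss
```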
“…Remark 3 In Theorem 2 above, we assumed that the observations y_t and the predictions f(x_t) are all bounded by B, and that B is known in advance by the forecaster. We can actually remove this requirement by using the adaptive techniques of Gerchinovitz and Yu (2014), namely, adaptive clipping of the intermediate predictions f_{t,j}(x_t) and adaptive Lipschitzification of the square loss functions. Remark 4 Even in the case when B is known by the forecaster, the clipping and Lipschitzification techniques of Gerchinovitz and Yu (2014) can be useful to get smaller constants in the regret bound. We could indeed replace the constants 50 and 120 with 8 and 48, respectively.…”
Section: The Chaining Exponentially Weighted Average Forecaster (mentioning)
confidence: 99%
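To make the two techniques in this last excerpt concrete: clipping truncates intermediate predictions to [-B, B], and "Lipschitzification" continues the square loss linearly outside that range so it becomes globally Lipschitz in the prediction. The tangent-continuation sketch below illustrates the idea only and is not claimed to be the exact construction of Gerchinovitz and Yu (2014).

```python
import numpy as np

def clip_prediction(y_hat: float, B: float) -> float:
    """Clip an intermediate prediction to [-B, B]; when outcomes lie
    in [-B, B], clipping can only reduce the square loss."""
    return float(np.clip(y_hat, -B, B))

def lipschitzified_square_loss(y_hat: float, y: float, B: float) -> float:
    """Square loss on [-B, B], continued linearly (with matching
    slope) outside, so the loss is Lipschitz in y_hat on the whole
    line. Illustrative sketch, not the paper's exact construction."""
    if abs(y_hat) <= B:
        return (y_hat - y) ** 2
    y_clip = float(np.sign(y_hat)) * B
    slope = 2.0 * (y_clip - y)  # derivative of the square loss at the boundary
    return (y_clip - y) ** 2 + slope * (y_hat - y_clip)
```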