2019
DOI: 10.48550/arxiv.1902.03046
Preprint

Beyond Least-Squares: Fast Rates for Regularized Empirical Risk Minimization through Self-Concordance

Abstract: We consider learning methods based on the regularization of a convex empirical risk by a squared Hilbertian norm, a setting that includes linear predictors and non-linear predictors through positive-definite kernels. In order to go beyond the generic analysis leading to convergence rates of the excess risk as O(1/√n) from n observations, we assume that the individual losses are self-concordant, that is, their third-order derivatives are bounded by their second-order derivatives. This setting includes least-squares…
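
To make the setting of the abstract concrete, here is a minimal sketch (not the authors' code; the names `regularized_logistic_erm`, `X`, `y`, and `lam` are illustrative assumptions) of empirical risk minimization with a self-concordant loss, the logistic loss, penalized by a squared norm:

```python
# Hedged sketch of the abstract's setting: a convex empirical risk (logistic loss,
# which is generalized self-concordant) plus a squared-norm ("ridge") regularizer.
import numpy as np
from scipy.optimize import minimize


def regularized_logistic_erm(X, y, lam):
    """Minimize (1/n) sum_i log(1 + exp(-y_i <w, x_i>)) + (lam/2) ||w||^2."""
    n, d = X.shape

    def objective(w):
        margins = y * (X @ w)
        # np.logaddexp(0, -m) = log(1 + exp(-m)), computed stably
        return np.mean(np.logaddexp(0.0, -margins)) + 0.5 * lam * (w @ w)

    def gradient(w):
        margins = y * (X @ w)
        coeffs = -y / (1.0 + np.exp(margins))  # derivative of log(1 + exp(-m)) in m
        return X.T @ coeffs / n + lam * w

    return minimize(objective, np.zeros(d), jac=gradient, method="L-BFGS-B").x


# Illustrative usage on synthetic data
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = np.sign(X @ rng.standard_normal(5) + 0.1 * rng.standard_normal(200))
w_hat = regularized_logistic_erm(X, y, lam=1e-2)
```

The paper's fast rates concern the excess risk of exactly this kind of estimator; non-linear predictors through positive-definite kernels fit the same template with the linear features replaced by a kernel feature map.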

Cited by 7 publications (15 citation statements)
References 19 publications

“…Statistical analysis using (generalized) self-concordant (SC) functions as a loss function is gaining increasing attention in the machine learning community [1,23,30,31]. This class of loss functions allows to obtain faster statistical rates akin to least-squares.…”
Section: Introduction (mentioning)
Confidence: 99%

“…In this paper, we introduce a new practical improper algorithm, that we call AIOLI (Algorithmic efficient Improper Online LogIstic regression), for online logistic regression. The latter is based on Follow The Regularized Leader (FTRL) [McMahan, 2011] with surrogate losses. AIOLI takes inspiration from the Azoury-Warmuth-Vovk forecaster (also named non-linear Ridge regression or AWV) from [Vovk, 2001] and [Azoury and Warmuth, 2001] which adds a non-proper penalty based on the next input x t and from Online Newton Step [Hazan et al, 2007] which leverage the exp-concavity of logistic regression to achieve logarithmic regret.…”
Section: Contributions (mentioning)
Confidence: 99%
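
For intuition on the family of methods this excerpt refers to, below is a heavily simplified sketch of Online Newton Step (Hazan et al., 2007) applied to online logistic regression. It is not AIOLI itself, it omits the projection step onto a bounded comparator set, and the parameter names (gamma, eps) are assumptions made here for illustration:

```python
# Simplified Online Newton Step for online logistic regression (labels in {-1, +1}).
# Not the AIOLI algorithm; the ONS projection step is omitted for brevity.
import numpy as np


def online_newton_step_logistic(stream, d, gamma=0.5, eps=1.0):
    w = np.zeros(d)
    A = eps * np.eye(d)  # running sum of gradient outer products
    for x, y in stream:
        margin = y * np.dot(w, x)
        g = -y * x / (1.0 + np.exp(margin))    # gradient of log(1 + exp(-y <w, x>))
        A += np.outer(g, g)                     # rank-one second-order update
        w = w - np.linalg.solve(A, g) / gamma   # Newton-like step
    return w
```

The rank-one update of A is what lets ONS exploit the exp-concavity of the logistic loss and obtain logarithmic regret, which is the property the excerpt attributes to it.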
“…Note that the effective dimension is always upper-bounded by d_eff(λ) ≤ n/λ, providing in the worst case, the regret upper-bound of order O(B√n) for well-chosen λ. Under the capacity condition, which is a classical assumption for kernels (see [Marteau-Ferey et al., 2019] for instance), better bounds on the effective dimension are provided which yield to faster regret rates.…”
Section: Non-parametric Setting (mentioning)
Confidence: 99%
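
As a concrete reading of the bound in this excerpt, one common convention (assumed here, not quoted from the cited works) defines the effective dimension of an n × n kernel matrix K as d_eff(λ) = tr(K (K + λI)^{-1}); for a kernel bounded by 1, tr(K) ≤ n, which recovers the worst-case bound d_eff(λ) ≤ n/λ:

```python
# Hedged sketch: effective dimension of a kernel matrix under one common convention.
import numpy as np


def effective_dimension(K, lam):
    """d_eff(lam) = tr(K (K + lam I)^{-1}), computed via a linear solve."""
    n = K.shape[0]
    return float(np.trace(np.linalg.solve(K + lam * np.eye(n), K)))


# Illustrative usage: Gaussian kernel on random points
rng = np.random.default_rng(0)
Z = rng.standard_normal((100, 3))
sq_dists = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * sq_dists)
print(effective_dimension(K, lam=1e-1))  # well below the worst case n/lam = 1000
```

Under a capacity condition the eigenvalues of K decay quickly, so d_eff(λ) is much smaller than n/λ, which is what drives the faster regret rates mentioned in the excerpt.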