2021
DOI: 10.48550/arxiv.2103.12024
Preprint

Stability and Deviation Optimal Risk Bounds with Convergence Rate $O(1/n)$

Abstract: The sharpest known high probability generalization bounds for uniformly stable algorithms (Feldman, Vondrák, NeurIPS 2018, COLT 2019; Bousquet, Klochkov, Zhivotovskiy, COLT 2020) contain a generally inevitable sampling error term of order Θ(1/√n). When applied to excess risk bounds, this leads to suboptimal results in several standard stochastic convex optimization problems. We show that if the so-called Bernstein condition is satisfied, the term Θ(1/√n) can be avoided, and high probability excess risk …
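For orientation, a rough sketch of the type of bound at issue (the constants and logarithmic factors below are indicative only, not quoted from the paper): for a $\gamma$-uniformly stable algorithm $A$ trained on an i.i.d. sample $S$ of size $n$, the cited high-probability bounds give, with probability at least $1-\delta$,

\[
R(A(S)) - \widehat{R}_S(A(S)) \;\lesssim\; \gamma \,\log(n)\,\log(1/\delta) \;+\; \sqrt{\frac{\log(1/\delta)}{n}},
\]

where $R$ denotes the population risk and $\widehat{R}_S$ the empirical risk. The second term is the $\Theta(1/\sqrt{n})$ sampling error referred to in the abstract; the paper's contribution is removing it from high-probability excess risk bounds when the Bernstein condition holds, yielding rates of order $O(1/n)$.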

Cited by 2 publications (2 citation statements) | References 19 publications
“…In recent years Hardt et al [6] first showed uniform stability final-iterate bounds for vanilla Stochastic Gradient Descent (SGD). More recent works develop alternative generalization error bounds based on high-probability analysis [7][8][9][10] and data-dependent variants [11], or under weaker assumptions such as strongly quasi-convex [12], non-smooth convex [13][14][15][16], and pairwise losses [17,18]. In the nonconvex case, [19] provide bounds that involve the on-average variance of the stochastic gradients.…”
Section: Introduction
confidence: 99%
“…In close relation to our paper, Hardt et al [1] first showed uniform stability final-iterate bounds for vanilla SGD. More recent works develop alternative generalization error bounds based on high-probability analysis [38][39][40][41] and data-dependent variants [42], or under different assumptions than those of prior works, such as strongly quasi-convex [43], non-smooth convex [44][45][46][47], and pairwise losses [48,49]. In the nonconvex case, [50] provide bounds that involve the on-average variance of the stochastic gradients.…”
Section: Introduction
confidence: 99%
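For context, the standard notion referenced in both excerpts (stated here in generic form, not quoted from the cited works): an algorithm $A$ is $\gamma$-uniformly stable with respect to a loss $f$ if, for all samples $S, S'$ of size $n$ that differ in exactly one example and every test point $z$,

\[
\left| f(A(S), z) - f(A(S'), z) \right| \;\le\; \gamma .
\]

Hardt et al. show that the final iterate of SGD satisfies a bound of this type for smooth losses, with $\gamma$ shrinking as the sample size $n$ grows (the precise rate depends on the step sizes, number of passes, and convexity assumptions); the generalization bounds surveyed in these excerpts build on that stability parameter.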