2022
DOI: 10.48550/arxiv.2201.12293
Preprint

Understanding Why Generalized Reweighting Does Not Improve Over ERM

Cited by 2 publications (2 citation statements). References 0 publications.
“…Reweighting Does Not Improve Over ERM [Zhai et al., 2022]. Summary of the Paper: The paper studies the performance of generalized reweighting (GRW) algorithms, a class of methods that aim to address distributional shift in machine learning tasks. The authors first prove that, for linear models and sufficiently wide fully-connected neural networks, the implicit bias of GRW is equivalent to that of empirical risk minimization (ERM) when trained for an infinitely long time, and that regularization must be large enough to significantly lower training performance in order to affect this implicit bias.…”
Section: Understanding Why Generalized Reweighting Does Not Improve Over ERM (citation type: mentioning)
confidence: 99%
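The statement above contrasts the GRW objective with plain ERM; the two differ only in how per-sample losses are averaged. Below is a minimal sketch of the two objectives; the losses, group labels, and weights are made up for illustration and are not taken from Zhai et al., 2022.

```python
import numpy as np

def erm_risk(losses):
    # ERM: uniform average of the per-sample losses.
    return float(losses.mean())

def grw_risk(losses, weights):
    # Generalized reweighting (GRW): weighted average of per-sample losses;
    # the weights may be fixed (importance weighting) or updated during
    # training (e.g. group DRO), but the objective has this common form.
    weights = weights / weights.sum()         # normalize to a distribution
    return float(np.dot(weights, losses))

# Hypothetical per-sample losses and group labels, purely for illustration.
losses = np.array([0.2, 0.9, 0.4, 1.1])
groups = np.array([0, 1, 0, 1])
weights = np.where(groups == 1, 3.0, 1.0)     # upweight the harder group

print(erm_risk(losses))            # 0.65
print(grw_risk(losses, weights))   # 0.825, emphasizing group 1
```

The cited result is that, although GRW and ERM optimize different weighted averages, for linear models and sufficiently wide networks trained to convergence they end up with (almost) the same implicit bias, unless regularization is strong enough to noticeably hurt training performance.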
“…Empirically, prior works [10,50] recognize that various importance weighting (IW) methods tend to exacerbate overfitting, so their effect diminishes over the course of stochastic gradient descent (SGD) training, especially when they are applied to over-parameterized neural networks (NNs). Theoretically, previous studies prove that, for over-parameterized neural networks, reweighting algorithms do not improve over ERM because their implicit biases are (almost) equivalent [51,58,62]. In addition, some prior works also point out that conventional regularization techniques such as weight decay cannot significantly improve the performance of IW [50].…”
Section: Introduction (citation type: mentioning)
confidence: 99%
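For concreteness, here is a minimal PyTorch sketch of the importance-weighted SGD training the statement describes; the data, weights, and tiny over-parameterized model are illustrative assumptions, not taken from the cited works [10, 50, 51, 58, 62].

```python
import torch
import torch.nn as nn

# Toy over-parameterized setup: far more hidden units than training points.
torch.manual_seed(0)
X = torch.randn(8, 5)
y = torch.randint(0, 2, (8,)).float()
w = 1.0 + 3.0 * y                             # weight 4 for class 1, 1 for class 0

model = nn.Sequential(nn.Linear(5, 256), nn.ReLU(), nn.Linear(256, 1))
# Weight decay of ordinary strength, as the statement notes, does not
# substantially change the long-run behaviour of importance weighting here.
opt = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-4)
bce = nn.BCEWithLogitsLoss(reduction="none")  # keep per-sample losses

for step in range(200):
    opt.zero_grad()
    per_sample = bce(model(X).squeeze(-1), y)
    loss = (w * per_sample).sum() / w.sum()   # importance-weighted risk
    loss.backward()
    opt.step()
```

The cited theoretical results say that, as training like this continues, the weighted objective drives the over-parameterized network toward (almost) the same solution as unweighted ERM, which is why the benefit of IW fades over epochs.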