2021
DOI: 10.48550/arxiv.2106.03212
Preprint

Towards an Understanding of Benign Overfitting in Neural Networks

Zhu Li, Zhi-Hua Zhou, Arthur Gretton

Abstract: Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss; yet surprisingly, they possess near-optimal prediction performance, contradicting classical learning theory. We examine how these benign overfitting phenomena occur in a two-layer neural network setting where sample covariates are corrupted with noise. We address the high dimensional regime, where the data dimension d grows with the number n of data points. Our analysis combines an up…
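
The abstract above describes the setting only at a high level; as a rough, self-contained illustration (not the paper's construction or analysis), the sketch below fits a two-layer network whose first layer is fixed and random to zero training loss on noise-corrupted covariates and then reports its test error. The dimensions, noise level, linear target, and ReLU feature map are all assumptions made for the demo.

```python
# Illustrative sketch only: a two-layer network with a fixed random first layer
# (lazy / random-features regime), second layer fit to zero training loss on
# noise-corrupted covariates. All sizes and noise levels are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 200, 400, 2000                        # samples, input dim, hidden width (assumed)
theta = rng.standard_normal(d) / np.sqrt(d)     # assumed linear target

def sample(n_pts, cov_noise):
    X = rng.standard_normal((n_pts, d))
    y = X @ theta                               # clean targets
    X_obs = X + cov_noise * rng.standard_normal((n_pts, d))  # corrupted covariates
    return X_obs, y

W = rng.standard_normal((d, m)) / np.sqrt(d)    # fixed (untrained) first layer

def features(X):
    return np.maximum(X @ W, 0.0)               # ReLU hidden activations

X_tr, y_tr = sample(n, cov_noise=0.5)
X_te, y_te = sample(2000, cov_noise=0.5)

# Fit the second layer to interpolate the training data (min-norm solution).
a = np.linalg.pinv(features(X_tr)) @ y_tr

print("train MSE:", np.mean((features(X_tr) @ a - y_tr) ** 2))  # ~0: zero training loss
print("test  MSE:", np.mean((features(X_te) @ a - y_te) ** 2))
print("null  MSE:", np.mean(y_te ** 2))                         # baseline: predict 0
```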

Cited by 5 publications (7 citation statements)
References 24 publications
“…The assumption that we have access to the exact form of the singular value decomposition of the feature matrix is only made to trivialize the calculations, and is by no means essential. When the optimization algorithm is stopped before perfect matching can be achieved, the fitting from the weighted optimization scheme generates a smoother approximation (with a smaller p, according to (23)) than what we would obtain with a regular least-squares minimization.…”
Section: Extension To General Feature Regression (mentioning)
confidence: 98%
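
To make the comparison in the statement above concrete, here is a minimal sketch, not the cited paper's scheme or its equation (23): plain minimum-norm least squares over a random feature matrix versus a weighted variant built from the SVD that shrinks coefficients along small-singular-value directions, giving a smoother fit at the cost of a nonzero training residual. The cosine features, the exponent alpha, and the regularizer lam are illustrative assumptions.

```python
# Hypothetical sketch: plain vs. SVD-weighted least-squares feature regression.
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D regression problem with noisy targets (illustrative assumption).
n, d = 40, 200                                  # over-parameterized: d > n
x = np.sort(rng.uniform(-1.0, 1.0, n))
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(n)

# Random cosine feature matrix F (n x d); an assumed stand-in for the
# "general feature regression" setting.
w = rng.standard_normal(d) * 5.0
b = rng.uniform(0.0, 2.0 * np.pi, d)
F = np.cos(np.outer(x, w) + b) / np.sqrt(d)

# Plain minimum-norm least squares: matches the noisy targets exactly.
c_plain = np.linalg.pinv(F) @ y

# Weighted variant via the SVD F = U S Vt: shrink coefficients along directions
# with small singular values (a ridge-like surrogate for the weighting discussed
# above; alpha and lam are assumptions, not values from the cited paper).
U, S, Vt = np.linalg.svd(F, full_matrices=False)
alpha, lam = 1.0, 1e-3
shrink = S ** (2 * alpha) / (S ** (2 * alpha) + lam)
c_weighted = Vt.T @ (shrink * (U.T @ y) / S)

print("train residual, plain   :", np.linalg.norm(F @ c_plain - y))     # ~0
print("train residual, weighted:", np.linalg.norm(F @ c_weighted - y))  # > 0, smoother fit
```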
“…In this section, we study the more realistic setting of training with noisy data. The common practice when solving the minimization problem in such a case is to avoid overfitting the model to the data by stopping the optimization algorithm early, at a level that depends on the noise in the data; but see [3,23] for some analysis in the direction of "benign overfitting". When training with noisy data, the impact of the weight matrices Λ^{-α}…”
Section: Error Bounds For Training With Noisy Data (mentioning)
confidence: 99%
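
A minimal sketch of the early-stopping practice described above, under assumed toy conditions (a linear least-squares problem with Gaussian noise and a discrepancy-principle style stopping rule); it is not the cited paper's algorithm and does not involve its weight matrices Λ^{-α}.

```python
# Assumed toy example: gradient descent on noisy least squares, stopped early
# once the training residual reaches the noise level instead of zero.
import numpy as np

rng = np.random.default_rng(1)
n, d, sigma = 50, 200, 0.5                       # samples, dimension, noise std (assumed)
A = rng.standard_normal((n, d)) / np.sqrt(d)
w_true = rng.standard_normal(d)
y = A @ w_true + sigma * rng.standard_normal(n)  # noisy training data

w = np.zeros(d)
lr = 0.5
for t in range(100_000):
    residual = A @ w - y
    # Stop once the training residual is down at the noise level (the
    # "appropriate level" mentioned above), rather than driving it to zero.
    if np.linalg.norm(residual) <= sigma * np.sqrt(n):
        print(f"stopped early at iteration {t}, residual {np.linalg.norm(residual):.3f}")
        break
    w -= lr * A.T @ residual / n                 # gradient step on (1/2n)*||Aw - y||^2
```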
“…While they offered important insights into the benign overfitting phenomenon, most of them are limited to the settings of linear models (Belkin et al, 2019b; Bartlett et al, 2020; Hastie et al, 2019; Wu and Xu, 2020; Chatterji and Long, 2020; Zou et al, 2021b; Cao et al, 2021) and kernel/random features models (Belkin et al, 2018; Liang and Rakhlin, 2020; Montanari and Zhong, 2020), and cannot be applied to neural network models that are of greater interest. The only notable exceptions are (Adlam and Pennington, 2020; Li et al, 2021), which attempted to understand benign overfitting in neural network models. However, they are still limited to the "neural tangent kernel regime" (Jacot et al, 2018) where the neural network learning problem is essentially equivalent to kernel regression.…”
Section: Introduction (mentioning)
confidence: 99%
“…Contrary to conventional statistical wisdom, overparameterization turns out to be a rather desirable property for neural networks. For instance, phenomena such as double descent [BHMM19, SGd+19, NKB+20] and benign overfitting [BLLT20, LZG21, BMR21] suggest that understanding the generalization properties of overparameterized models lies beyond the scope of the usual control of capacity via the size of the parameter set [NTS15].…”
Section: Introduction (mentioning)
confidence: 99%