2017
DOI: 10.48550/arxiv.1712.06541
Preprint

Size-Independent Sample Complexity of Neural Networks

Cited by 52 publications (102 citation statements) | References 0 publications
“…It is well known that the VC-dimension of neural networks is at least linear in the number of parameters (Bartlett et al., 2017b), and therefore classical VC theory cannot explain the generalization ability of modern neural networks with more parameters than training samples. Researchers have proposed norm-based generalization bounds (Bartlett & Mendelson, 2002; Bartlett et al., 2017a; Neyshabur et al., 2015, 2017, 2019; Konstantinos et al., 2017; Golowich et al., 2017; Li et al., 2018a) and compression-based bounds (Arora et al., 2018). Dziugaite & Roy (2017) and Zhou et al. (2019) used the PAC-Bayes approach to compute non-vacuous generalization bounds for MNIST and ImageNet, respectively.…”
Section: Related Work (mentioning)
confidence: 99%
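To make "norm-based" concrete, here is a hedged sketch of the generic shape such bounds take; it is not a quotation from any cited work, the exact constants and depth dependence vary across the papers listed above, and the layer-wise norm bounds M_1, …, M_d and input-norm bound B are illustrative symbols:

\[
  L(f) \;\le\; \widehat{L}_n(f) \;+\; 2\,\mathfrak{R}_n(\mathcal{F}) \;+\; 3\sqrt{\frac{\log(2/\delta)}{2n}},
  \qquad
  \mathfrak{R}_n(\mathcal{F}) \;\lesssim\; \frac{B\,\sqrt{d}\,\prod_{j=1}^{d} M_j}{\sqrt{n}},
\]

where L(f) is the expected (bounded, Lipschitz) loss, \widehat{L}_n(f) its empirical average over n samples, and \mathfrak{R}_n(\mathcal{F}) the Rademacher complexity of the depth-d network class \mathcal{F}. The point of bounds of this form is that the right-hand side depends on norms of the weight matrices rather than on the number of weights.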
“…Particularly, one open question looms large in the recent literature: how to theoretically guarantee the generalization performance of NNs trained with finitely many samples. To address this question, significant advances have been reported by Bartlett et al. (2017), Golowich et al. (2017), Neyshabur et al. (2015, 2017, 2018), and Arora et al. (2018). In most of the existing results, the generalization error depends polynomially on the dimensionality (number of weights).…”
Section: Regularized Neural Network (mentioning)
confidence: 99%
“…For example, Graves et al. (2013) report that after training with merely 462 speech samples, deep LSTM RNNs achieve a test-set error of 17.7% on the TIMIT phoneme recognition benchmark, the best recorded score. Despite the popularity of RNNs in applications, their theory is less studied than that of feedforward neural networks (Haussler, 1992; Bartlett et al., 2017; Neyshabur et al., 2017; Golowich et al., 2017; Li et al., 2018). Several long-standing fundamental questions remain regarding the approximation, trainability, and generalization of RNNs.…”
Section: Introduction (mentioning)
confidence: 99%