Deep learning: a statistical viewpoint (2021)
DOI: 10.1017/s0962492921000027

Abstract: The remarkable practical success of deep learning has revealed some major surprises from a theoretical perspective. In particular, simple gradient methods easily find near-optimal solutions to non-convex optimization problems, and despite giving a near-perfect fit to training data without any explicit effort to control model complexity, these methods exhibit excellent predictive accuracy. We conjecture that specific principles underlie these phenomena: that overparametrization allows gradient methods to find i…
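To make the conjectured mechanism concrete, here is a minimal sketch of benign overfitting in the simplest setting the paper analyses: overparameterized linear regression, where gradient descent from zero initialization converges to the minimum-norm interpolating solution. The dimensions, sparsity, and noise level below are illustrative choices, not taken from the paper, and whether the test error is near-optimal depends on the covariance structure the authors study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Overparameterized linear regression: many more features than samples.
n, d = 50, 2000
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[:5] = 1.0                              # sparse ground-truth signal
y = X @ w_true + 0.5 * rng.normal(size=n)     # noisy training labels

# Minimum-norm interpolating solution: the limit that gradient descent
# on least squares reaches from zero initialization.
w_hat = np.linalg.pinv(X) @ y

train_mse = np.mean((X @ w_hat - y) ** 2)     # essentially zero: interpolation

# Fresh samples from the same distribution.
X_test = rng.normal(size=(1000, d))
y_test = X_test @ w_true + 0.5 * rng.normal(size=1000)
test_mse = np.mean((X_test @ w_hat - y_test) ** 2)

print(f"train MSE: {train_mse:.2e}")  # tiny: the noisy labels are fitted exactly
print(f"test MSE:  {test_mse:.2f}")   # finite despite the perfect training fit
```

The point of the sketch is only that a model can fit noisy training data exactly and still predict reasonably on fresh samples; the paper characterizes when this overfitting is truly benign.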

Cited by 110 publications (62 citation statements)
References 85 publications

“…Despite the tile-level false positives, the quantitative measures have shown excellent predictive accuracy, robust performance and generalization on both cohorts. This may be associated with interpolation and generalization capabilities of overparameterized deep learning systems leading to benign overfitting, as demonstrated by Bartlett et al. in their latest findings (23).…”
Section: Discussion
confidence: 91%
“…This may be associated with interpolation and generalization capabilities of overparameterized deep learning systems leading to benign overfitting, as demonstrated by Bartlett et al. in their latest findings (23).…”
Section: Discussion
confidence: 94%
“…The activation of one neuron generates a data analysis result, and many neurons are connected to form a complete NN model that outputs the analysis result for the complete data. NN technology first emerged in the 1950s and 1960s; the simplest NN is the perceptron, which is essentially a feedforward NN structure and is also relatively common [9]. The perceptron consists of an input layer, an output layer, and a hidden layer.…”
Section: DL and Security Evaluation of Enterprises
confidence: 99%
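As context for the quoted description, here is a minimal sketch of Rosenblatt's 1950s-era perceptron learning rule. Note that in its original single-layer form the perceptron has only input and output units; the hidden layer the quote mentions belongs to later multilayer perceptrons. The toy data, labels, and epoch count are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linearly separable toy data with labels in {-1, +1}.
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 0.0, 1, -1)

# Rosenblatt's perceptron rule: nudge the weights toward each
# misclassified example until every point is on the correct side.
w = np.zeros(2)
b = 0.0
for _ in range(20):                      # fixed number of passes over the data
    for xi, yi in zip(X, y):
        if yi * (xi @ w + b) <= 0.0:     # misclassified (or on the boundary)
            w += yi * xi
            b += yi

accuracy = np.mean(np.sign(X @ w + b) == y)
print(f"training accuracy: {accuracy:.2f}")  # reaches 1.00 on separable data
```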
“…Several recent works have investigated the nature of modern Deep Neural Networks (DNNs) past the point of zero training error (Belkin, 2021; Nakkiran et al., 2020; Bartlett et al., 2021; Power et al., 2022). The stage at which the training error reaches zero is called the Interpolation Threshold (IT), since at this point the learned network function interpolates between training samples.…”
Section: Introduction
confidence: 99%
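A minimal sketch of how the Interpolation Threshold described in the quote could be located in practice: train until the 0-1 training error first hits zero and record that epoch. The model here (overparameterized logistic regression trained by full-batch gradient descent), the sizes, and the learning rate are illustrative assumptions, not taken from any of the cited works.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical overparameterized setup: more features than samples, so even
# randomly labelled training data is linearly separable and can be fitted.
n, d = 40, 400
X = rng.normal(size=(n, d))
y = rng.integers(0, 2, size=n)               # random binary labels

w = np.zeros(d)
lr = 0.1
interpolation_epoch = None
for epoch in range(1, 2001):
    p = 1.0 / (1.0 + np.exp(-(X @ w)))       # sigmoid predictions
    if np.mean((p > 0.5) != y) == 0.0:       # 0-1 training error hits zero
        interpolation_epoch = epoch          # the Interpolation Threshold
        break
    w -= lr * X.T @ (p - y) / n              # full-batch logistic-loss step

print(f"zero training error first reached at epoch {interpolation_epoch}")
```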