2020
DOI: 10.1162/neco_a_01276

The Stochastic Delta Rule: Faster and More Accurate Deep Learning Through Adaptive Weight Noise

Abstract: Multilayer neural networks have led to remarkable performance on many kinds of benchmark tasks in text, speech, and image processing. Nonlinear parameter estimation in hierarchical models is known to be subject to overfitting and misspecification. One approach to these estimation and related problems (e.g., saddle points, colinearity, feature discovery) is called Dropout. The Dropout algorithm removes hidden units according to a binomial random variable with probability [Formula: see text] prior to each update…
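
The unit-removal step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the `keep_prob` parameter, and the value 0.5 are assumptions chosen for illustration (the abstract's actual probability is elided above), and the sketch omits the activation rescaling that many Dropout implementations apply.

```python
import random

def dropout_mask(n_units, keep_prob, rng):
    """Draw one independent keep (1) / drop (0) decision per hidden unit.

    keep_prob is a hypothetical parameter name; the probability used in
    the paper is not visible in the truncated abstract.
    """
    return [1 if rng.random() < keep_prob else 0 for _ in range(n_units)]

def apply_dropout(activations, keep_prob, rng):
    """Zero the activations of dropped hidden units for one update step."""
    mask = dropout_mask(len(activations), keep_prob, rng)
    dropped = [a * m for a, m in zip(activations, mask)]
    return dropped, mask

# A fresh mask is drawn before each update, so a different subset of
# hidden units is removed every time.
rng = random.Random(0)              # seeded only to make the sketch repeatable
hidden = [0.5, -1.2, 0.3, 2.0]      # made-up hidden-layer activations
dropped, mask = apply_dropout(hidden, keep_prob=0.5, rng=rng)
```

Each call draws a new mask, which is what makes the removal stochastic from one update to the next; setting `keep_prob` to 1.0 leaves the layer unchanged.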

Cited by 11 publications (4 citation statements)
References 6 publications
“…We note that in machine learning, noise processes such as dropout and stochastic regularization (e.g., refs. 30–32) can be applied to improve generalization from finite training data. Intrinsic synaptic noise is qualitatively different from these regularization processes.…”
Section: Results (mentioning)
confidence: 99%
“…In the past, various CNN architectures and modifications have been developed, and some show a high classification accuracy in the ImageNet dataset [49, 50, 51]. However, as the newer CNN architectures were not (yet) implemented in the software that was used here, we choose CNN architectures that were previously used to classify image data and were available in our software.…”
Section: Discussion (mentioning)
confidence: 99%
“…In order to validate the denoising effectiveness of the algorithms proposed in this article, several traditional and literature algorithms are selected for comparison (Chen & Han, 2005; Hu et al, 2021; Deng et al, 2020; Frazier-Logue & José Hanson, 2020). The simulation process involves collecting data for each algorithm and comparing them.…”
Section: Simulation (mentioning)
confidence: 99%