1995
DOI: 10.1162/neco.1995.7.1.108

Training with Noise is Equivalent to Tikhonov Regularization

Abstract: It is well known that the addition of noise to the input data of a neural network during training can, in some circumstances, lead to significant improvements in generalization performance. Previous work has shown that such training with noise is equivalent to a form of regularization in which an extra term is added to the error function. However, the regularization term, which involves second derivatives of the error function, is not bounded below, and so can lead to difficulties if used directly in a learnin…
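
The equivalence is easiest to see in the linear least-squares special case, where training on noise-corrupted inputs matches Tikhonov (ridge) regression in expectation. The sketch below illustrates only that special case, not the paper's neural-network derivation with its second-derivative regularizer; the data, variable names, and noise level are all assumptions made for illustration.

```python
# Hedged numerical sketch (not the paper's derivation): for a linear model with
# sum-of-squares error, minimizing the loss averaged over zero-mean Gaussian
# input noise of variance sigma2 is equivalent to Tikhonov (ridge) regression
# with penalty n * sigma2 * ||w||^2. All data here is synthetic.
import numpy as np

rng = np.random.default_rng(0)
n, d, sigma2 = 200, 5, 0.1
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.05 * rng.normal(size=n)

# (a) "Training with noise": stack many noise-corrupted replicas of X and fit
#     ordinary least squares, which minimizes the noise-averaged squared error.
reps = 500
X_noisy = np.vstack([X + np.sqrt(sigma2) * rng.normal(size=X.shape) for _ in range(reps)])
y_noisy = np.tile(y, reps)
w_noise = np.linalg.lstsq(X_noisy, y_noisy, rcond=None)[0]

# (b) Tikhonov / ridge regression: E[Xn'Xn] = X'X + n*sigma2*I, so the expected
#     noisy objective has exactly the ridge normal equations.
w_ridge = np.linalg.solve(X.T @ X + n * sigma2 * np.eye(d), X.T @ y)

print(np.round(w_noise, 3))
print(np.round(w_ridge, 3))  # should closely match w_noise for moderate sigma2
```

As the number of noisy replicas grows, the two weight vectors coincide, which is the noise-as-regularization effect the abstract describes in its simplest setting.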

Cited by 1,017 publications (594 citation statements)
References 11 publications (11 reference statements)
“…However, intuitively, training with "inconsistent" data can be understood as some extra regularization by noise. 24 The results on data set (1) reported in Table 3 vary considerably, with error rates of 13.6% for the Linear Programming Machine, which is our worst result, but is still a relative improvement of 24% over the result reported in ref 14 and of 31% over the result from ref 12 (note, however, that the class priors are different in ref 12). Our best result is 6.8% error from an RBF-SVM of kernel width σ² = 5, a relative improvement of 62% and 66% compared to the errors reported in refs 14 and 12, respectively.…”
Section: Comparison With Prior Results (mentioning)
confidence: 99%
“…It creates synthetic entities of the minority class during the model training phase to regularize the prediction models, avoid overfitting, and learn structures representing minority entities. In many ways, SMOTE resembles distortion-based model regularization techniques [34,35]. In this section, we will briefly study the adaptation of the algorithm for LTV prediction.…”
Section: Imbalance in Behavioral Datasets and Synthetic Minority Oversampling (mentioning)
confidence: 99%
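
A minimal sketch of the SMOTE oversampling step described in the statement above, using the imbalanced-learn library (an assumption about tooling; the citing paper's own adaptation for LTV prediction is not reproduced here).

```python
# Hedged sketch of SMOTE oversampling with imbalanced-learn (assumed tooling).
# SMOTE interpolates between a minority sample and its nearest minority
# neighbours to create synthetic minority entities before model training.
import numpy as np
from imblearn.over_sampling import SMOTE

rng = np.random.default_rng(0)
# Imbalanced toy data: 950 majority samples, 50 minority samples.
X = np.vstack([rng.normal(0, 1, size=(950, 4)), rng.normal(2, 1, size=(50, 4))])
y = np.concatenate([np.zeros(950, dtype=int), np.ones(50, dtype=int)])

X_res, y_res = SMOTE(random_state=0).fit_resample(X, y)
print(np.bincount(y), np.bincount(y_res))  # class counts before and after
```

The resampled data would then be passed to whatever prediction model is being trained, with the synthetic minority entities playing the regularizing role the authors describe.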
“…(IV) A publicly calculated linear approximation x′Bu − γ is computed by some standard method, such as a 1-norm error minimization together with a Tikhonov regularization term ν‖u‖₁ [17,2], as follows: min_(u,γ,s) …”
Section: Privacy-preserving Linear Kernel Approximation (mentioning)
confidence: 99%
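
The truncated minimization in the statement above can be posed as a linear program. The sketch below is a generic reconstruction under stated assumptions: minimize the 1-norm of the residuals of Bu − γe against targets y plus ν‖u‖₁. The matrix B, targets y, and parameter ν are illustrative, and the cited paper's exact constraints may differ.

```python
# Hedged sketch: fit the linear approximation x'Bu - gamma by minimizing
# ||B u - gamma*e - y||_1 + nu*||u||_1 as a linear program. Variables: u, gamma,
# slack s >= |residuals|, slack t >= |u|. B, y, and nu are illustrative.
import numpy as np
from scipy.optimize import linprog

def l1_fit(B, y, nu=0.1):
    m, n = B.shape
    e = np.ones((m, 1))
    # Decision vector z = [u (n), gamma (1), s (m), t (n)].
    c = np.concatenate([np.zeros(n + 1), np.ones(m), nu * np.ones(n)])
    A_ub = np.block([
        [ B, -e, -np.eye(m), np.zeros((m, n))],                        #  Bu - g - s <= y
        [-B,  e, -np.eye(m), np.zeros((m, n))],                        # -Bu + g - s <= -y
        [ np.eye(n), np.zeros((n, 1)), np.zeros((n, m)), -np.eye(n)],  #  u - t <= 0
        [-np.eye(n), np.zeros((n, 1)), np.zeros((n, m)), -np.eye(n)],  # -u - t <= 0
    ])
    b_ub = np.concatenate([y, -y, np.zeros(2 * n)])
    bounds = [(None, None)] * (n + 1) + [(0, None)] * (m + n)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    u, gamma = res.x[:n], res.x[n]
    return u, gamma

# Synthetic usage example; the fitted approximation for a new point x is x @ u - gamma.
rng = np.random.default_rng(0)
B = rng.normal(size=(60, 5))
y = B @ np.array([1.0, -2.0, 0.0, 0.5, 0.0]) - 3.0 + 0.01 * rng.normal(size=60)
u, gamma = l1_fit(B, y, nu=0.1)
print(np.round(u, 2), round(gamma, 2))
```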