1998
DOI: 10.1007/3-540-49430-8_6

Adaptive Regularization in Neural Network Modeling

Cited by 33 publications (21 citation statements)
References 30 publications
“…In order to make the search more efficient, previous work proposed to alternate the optimization of Λ and Θ, either between consecutive full training runs [5,20] or on the fly [22,26]. Compared to grid search, where Λ is fixed during a full training run, the on-the-fly adaptive methods in [22,26] adjust Λ according to performance on validation sets at every training step.…”
Section: Methods
confidence: 99%
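The on-the-fly adjustment of Λ described in this excerpt can be illustrated with a small hypergradient loop. The sketch below is an assumed, minimal example (linear regression with a single weight-decay parameter lam, a one-step unrolled hypergradient, plain NumPy), not the exact scheme of references [22,26] or of the paper under discussion:

```python
# Minimal sketch of on-the-fly adaptation of one regularization hyperparameter
# (weight decay lam): after each training step, lam is nudged in the direction
# that reduces the validation loss. The one-step unrolling
#   w_new = w - eta * (grad_train(w) + lam * w)  =>  d w_new / d lam = -eta * w
# is an assumption of this sketch, not the exact method of the cited papers.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic linear-regression data split into training and validation sets.
d = 20
w_true = rng.normal(size=d)
X_tr, X_val = rng.normal(size=(100, d)), rng.normal(size=(50, d))
y_tr = X_tr @ w_true + 0.5 * rng.normal(size=100)
y_val = X_val @ w_true + 0.5 * rng.normal(size=50)

def grad_mse(X, y, w):
    # Gradient of the mean squared error (1/n)||Xw - y||^2.
    return 2.0 * X.T @ (X @ w - y) / len(y)

w = np.zeros(d)
lam = 1.0                   # initial regularization strength
eta, eta_lam = 1e-2, 1e-3   # learning rates for weights and for lam

for step in range(2000):
    # Parameter step on the regularized training loss.
    g_train = grad_mse(X_tr, y_tr, w) + lam * w
    w_new = w - eta * g_train

    # Hypergradient step on the validation loss (one-step unrolling).
    hypergrad = grad_mse(X_val, y_val, w_new) @ (-eta * w)
    lam = max(lam - eta_lam * hypergrad, 0.0)   # keep lam non-negative

    w = w_new

print(f"adapted lam = {lam:.4f}, "
      f"val MSE = {np.mean((X_val @ w - y_val) ** 2):.4f}")
```

The design choice mirrored here is the one the excerpt emphasizes: lam is updated after every parameter step from validation-set performance, rather than being held fixed for a full training run as in grid search.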
“…To overcome this challenge, some methods have been proposed, such as criteria-based model selection, early stopping, Bayesian regularization, and stacked generalization. In the present study, an ANN with Bayesian regularization was chosen as the base ANN for surrogate modelling, given its stability.…”
Section: Fundamentals and Improvements
confidence: 99%
“…However, if this ratio is not selected properly, it causes under-fitting or over-fitting. To set the ratio automatically, Bayesian analysis is combined with the regularization of the ANN. Bayesian optimization of the regularization parameters, using the Gauss-Newton approximation to the Hessian matrix, is iterated to maximize the posterior density (denoted by the subscript and superscript MP in the cited work's equations) until convergence.…”
Section: Fundamentals and Improvements
confidence: 99%
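One common reading of the procedure this excerpt describes is a MacKay-style evidence framework: the objective is F = beta*E_D + alpha*E_W, the Hessian is approximated as H ≈ 2*beta*JᵀJ + 2*alpha*I, and alpha, beta are re-estimated from the effective number of parameters until convergence. The sketch below assumes a linear model (where the Gauss-Newton approximation is exact) and is an illustration of that reading, not code from the cited work:

```python
# Assumed sketch of evidence-framework re-estimation of the regularization
# parameters alpha (weight penalty) and beta (data misfit) for a linear model.
import numpy as np

rng = np.random.default_rng(1)
n, k = 200, 10
X = rng.normal(size=(n, k))                  # design matrix (J in Gauss-Newton)
w_true = rng.normal(size=k)
y = X @ w_true + 0.3 * rng.normal(size=n)

alpha, beta = 1.0, 1.0                       # initial regularization parameters
for _ in range(50):
    # Most-probable weights for the current alpha, beta (ridge solution).
    H = 2.0 * beta * X.T @ X + 2.0 * alpha * np.eye(k)
    w_mp = np.linalg.solve(H, 2.0 * beta * X.T @ y)

    E_D = np.sum((X @ w_mp - y) ** 2)        # data error
    E_W = np.sum(w_mp ** 2)                  # weight error
    gamma = k - 2.0 * alpha * np.trace(np.linalg.inv(H))  # effective parameters

    alpha_new = gamma / (2.0 * E_W)
    beta_new = (n - gamma) / (2.0 * E_D)
    if abs(alpha_new - alpha) < 1e-8 and abs(beta_new - beta) < 1e-8:
        break
    alpha, beta = alpha_new, beta_new

print(f"alpha/beta ratio = {alpha / beta:.4f}, gamma = {gamma:.2f}")
```

The ratio alpha/beta printed at the end plays the role of the regularization ratio that the excerpt says must be set automatically rather than by hand.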
“…Gradient-based hyperparameter learning algorithms have been proposed for a variety of supervised learning models, including neural networks (Larsen et al., 1996a; Andersen et al., 1997; Goutte & Larsen, 1998; Larsen et al., 1996b), support vector machines (Glasmachers & Igel, 2005; Keerthi et al., 2007; Chapelle et al., 2002), and, more recently, conditional log-linear models (Do et al., 2008). However, these algorithms typically require complicated computations, making them cumbersome to implement.…”
Section: Introduction
confidence: 99%
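To make the "complicated computations" concrete, the sketch below shows one form a gradient-based hyperparameter update can take: for ridge regression the implicit function theorem gives the hypergradient in closed form, at the cost of a linear solve against the regularized Hessian. This is an assumed illustration of the general idea, not an implementation of any specific cited algorithm:

```python
# Closed-form hypergradient for ridge regression via the implicit function theorem:
#   w*(lam)   = argmin_w ||X_tr w - y_tr||^2 + lam ||w||^2
#   dw*/dlam  = -(X_tr^T X_tr + lam I)^{-1} w*
#   dL_val/dlam = grad_w L_val(w*) . dw*/dlam
# The linear solve in dw*/dlam is what makes such methods heavy for large models.
import numpy as np

def ridge_hypergradient(X_tr, y_tr, X_val, y_val, lam):
    k = X_tr.shape[1]
    A = X_tr.T @ X_tr + lam * np.eye(k)
    w_star = np.linalg.solve(A, X_tr.T @ y_tr)          # inner (ridge) solution
    dw_dlam = -np.linalg.solve(A, w_star)                # implicit derivative
    g_val = 2.0 * X_val.T @ (X_val @ w_star - y_val) / len(y_val)
    return g_val @ dw_dlam                               # dL_val / dlam

# Usage: pick lam by gradient descent on the validation loss.
rng = np.random.default_rng(2)
X_tr, X_val = rng.normal(size=(80, 15)), rng.normal(size=(40, 15))
w_true = rng.normal(size=15)
y_tr = X_tr @ w_true + rng.normal(size=80)
y_val = X_val @ w_true + rng.normal(size=40)

lam = 1.0
for _ in range(200):
    lam = max(lam - 0.1 * ridge_hypergradient(X_tr, y_tr, X_val, y_val, lam), 1e-6)
print(f"lam selected by hypergradient descent: {lam:.4f}")
```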