2012
DOI: 10.1162/evco_a_00069

Resampling Methods for Meta-Model Validation with Recommendations for Evolutionary Computation

Abstract: Meta-modeling has become a crucial tool in solving expensive optimization problems. Much of the work in the past has focused on finding a good regression method to model the fitness function. Examples include classical linear regression, splines, neural networks, Kriging, and support vector regression. This paper specifically draws attention to the fact that assessing model accuracy is a crucial aspect in the meta-modeling framework. Resampling strategies such as cross-validation, subsampling, bootstrapping, an…

Cited by 154 publications (124 citation statements)
References 57 publications
“…It has been observed that the holdout estimator tends to be too pessimistic because only a proportion of the data is used to build the model (Bischl et al. 2012). Correspondingly, a variation of the holdout method, which partially alleviates this biased behavior, consists of repeating the random partition into training and test sets several times; the classifier is trained and tested on each partition, and the performances are averaged to yield an overall estimate that is generally more reliable.…”
Section: Data Splitting Methods
confidence: 99%
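To make the repeated-holdout procedure described in the statement above concrete, the following Python sketch averages the test error over several random splits. This is a minimal illustration, not code from the cited paper; the synthetic data set and the linear model are placeholder assumptions.

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Placeholder data standing in for expensive fitness evaluations.
X, y = make_regression(n_samples=200, n_features=5, noise=0.5, random_state=0)

# Repeated holdout: re-split the data with different seeds, train and
# test on each split, and average the error estimates.
errors = []
for seed in range(25):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                              random_state=seed)
    model = LinearRegression().fit(X_tr, y_tr)
    errors.append(mean_squared_error(y_te, model.predict(X_te)))

print(f"repeated-holdout MSE: {np.mean(errors):.3f} +/- {np.std(errors):.3f}")

Averaging over many random splits reduces the variance of the estimate, although each individual model is still trained on only 70% of the data, so some pessimistic bias remains.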
“…With a large number of subsets, the estimator will be very accurate, but the variance will be large. Conversely, with a reduced number of subsets, the variance will be small, but the estimator will be strongly biased (i.e., too conservative) (Bischl et al. 2012). Although K = 5 and K = 10 are common choices that perform reasonably well for data sets of different sizes, it is worth noting that for very small data sets, a larger value of K (or even the leave-one-out method) may become slightly preferable in order to train on as many examples as possible.…”
Section: Data Splitting Methods
confidence: 99%
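The bias-variance trade-off in the choice of K described above can be observed with a small experiment. The sketch below compares cross-validated error estimates for several fold counts and for leave-one-out; the synthetic data set is an assumption for illustration only.

from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, LeaveOneOut, cross_val_score

X, y = make_regression(n_samples=60, n_features=5, noise=0.5, random_state=0)
model = LinearRegression()

# Fewer folds -> each training set is smaller -> more pessimistic (biased)
# estimate; more folds -> less bias but a more variable estimate.
for K in (2, 5, 10):
    cv = KFold(n_splits=K, shuffle=True, random_state=0)
    scores = cross_val_score(model, X, y, cv=cv,
                             scoring="neg_mean_squared_error")
    print(f"K={K:2d}: MSE estimate {-scores.mean():.3f}")

# For very small data sets, leave-one-out trains on all but one example.
loo_scores = cross_val_score(model, X, y, cv=LeaveOneOut(),
                             scoring="neg_mean_squared_error")
print(f"LOO : MSE estimate {-loo_scores.mean():.3f}")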
“…Simulations [7] may also be used to generate new data. A tool for enriching databases to fill data gaps is the imputation of missing data [31].…”
Section: Data Acquisition and Enrichment
confidence: 99%
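As a minimal illustration of missing-data imputation in the sense of the statement above (the toy matrix and the mean strategy are assumptions, not taken from reference [31]), scikit-learn's SimpleImputer fills each gap with a column statistic:

import numpy as np
from sklearn.impute import SimpleImputer

# Toy feature matrix with gaps (np.nan marks missing entries).
X = np.array([[1.0, 2.0],
              [np.nan, 3.0],
              [4.0, np.nan],
              [5.0, 6.0]])

# Mean imputation: replace each gap with the mean of its column.
X_filled = SimpleImputer(strategy="mean").fit_transform(X)
print(X_filled)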
“…Predictive power is typically assessed by means of so-called resampling methods, where the distribution of power characteristics is studied by artificially varying the subpopulation used to learn the model. Characteristics of such distributions can be used for model selection [7].…”
Section: Model Validation and Model Selection
confidence: 99%
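A short sketch of how such resampling distributions can drive model selection (the candidate models and data are hypothetical, not the procedure of [7] itself): each candidate receives a cross-validated distribution of errors, and both its mean and its spread are compared.

from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import KFold, cross_val_score

X, y = make_regression(n_samples=100, n_features=5, noise=0.5, random_state=0)
cv = KFold(n_splits=10, shuffle=True, random_state=0)

# Each candidate model yields a distribution of resampled errors; the
# mean and the spread of that distribution can both inform selection.
for name, model in [("linear", LinearRegression()), ("ridge", Ridge(alpha=1.0))]:
    mse = -cross_val_score(model, X, y, cv=cv, scoring="neg_mean_squared_error")
    print(f"{name}: mean MSE {mse.mean():.3f}, std {mse.std():.3f}")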
“…We may apply bootstrapping to estimate the distribution of these validation statistics; see Bischl et al. (2012).…”
Section: Validation of Metamodels
confidence: 99%
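A minimal bootstrap sketch in the spirit of this statement (the per-point validation errors are simulated placeholders, not results from Bischl et al. 2012): resampling the observed errors with replacement approximates the sampling distribution of the mean error, the validation statistic of interest.

import numpy as np

rng = np.random.default_rng(0)

# Placeholder per-point validation errors from some fitted metamodel.
errors = rng.normal(loc=1.0, scale=0.3, size=50)

# Bootstrap: resample the errors with replacement many times to estimate
# the sampling distribution of the mean error (the validation statistic).
boot_means = np.array([rng.choice(errors, size=errors.size, replace=True).mean()
                       for _ in range(2000)])
lo, hi = np.percentile(boot_means, [2.5, 97.5])
print(f"mean error {errors.mean():.3f}, 95% bootstrap CI [{lo:.3f}, {hi:.3f}]")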