2017
DOI: 10.1101/172395
Preprint

Stacked Generalization: An Introduction to Super Learning

Abstract: Stacked generalization is an ensemble method that allows researchers to combine several different prediction algorithms into one. Since its introduction in the early 1990s, the method has evolved several times into what is now known as “Super Learner”. Super Learner uses V-fold cross-validation to build the optimal weighted combination of predictions from a library of candidate algorithms. Optimality is defined by a user-specified objective function, such as minimizing mean squared error or maximizing the are…
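The procedure the abstract describes can be sketched in a few lines: collect out-of-fold predictions from each candidate algorithm via V-fold cross-validation, then solve for the convex combination of candidates that minimizes cross-validated mean squared error. The candidate library, data, and optimizer below are illustrative choices, not the paper's:

```python
# Minimal sketch of the Super Learner idea: V-fold cross-validated
# predictions from each candidate, then MSE-optimal convex weights.
import numpy as np
from scipy.optimize import minimize
from sklearn.model_selection import cross_val_predict, KFold
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=200)

candidates = [LinearRegression(), DecisionTreeRegressor(max_depth=3)]
cv = KFold(n_splits=5, shuffle=True, random_state=0)

# Z: n x K matrix of out-of-fold ("level-one") predictions
Z = np.column_stack([cross_val_predict(m, X, y, cv=cv) for m in candidates])

def cv_mse(w):
    """Cross-validated MSE of the weighted combination."""
    return np.mean((y - Z @ w) ** 2)

# Minimize CV-MSE over the probability simplex (non-negative, sum to 1)
k = Z.shape[1]
res = minimize(cv_mse, x0=np.full(k, 1.0 / k),
               bounds=[(0.0, 1.0)] * k,
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0})
weights = res.x  # Super Learner weights over the candidate library
```

Because each single candidate corresponds to a vertex of the simplex, the optimized combination can do no worse (in cross-validated MSE) than the best single candidate.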


Cited by 16 publications (12 citation statements) · References 27 publications
“…Because of this variability in performance, which by itself provides an assessment of the different ML algorithms, we then combined them into the SLA. The more sophisticated SLA demonstrated superiority over the other singularly used ML algorithms; the SLA has also been shown in other analyses to be the optimal tool for constructing such predictive models [27,28]. Such an approach may additionally be more efficient when the number of covariates is large for assessing the multiple covariate interactions and correlation terms than simpler statistical approaches.…”
Section: Discussion
confidence: 97%
“…For each outcome, 20 imputations were performed and TMLE was performed on all 20 imputed datasets; estimates were then pooled across models. TMLE was performed using a cross-validated ensemble machine learning procedure (the Super Learner) 31 that fits the best-weighted combination of different models; candidate models included generalized linear models, generalized additive models, and multivariate adaptive regression splines. Models adjusted for patient gender, age at the baseline visit, baseline hearing ability, and annual follow-up visit audiometric outcomes.…”
Section: Discussion
confidence: 99%
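The pooling step in the excerpt above (estimates combined across 20 imputed datasets) is conventionally done with Rubin's rules: average the point estimates, and combine within- and between-imputation variance. A minimal sketch with illustrative numbers (the values below are hypothetical, not the cited paper's data):

```python
# Hedged sketch of Rubin's rules for pooling across multiple imputations.
import numpy as np

rng = np.random.default_rng(1)
m = 20                                       # number of imputations
est = 1.5 + rng.normal(scale=0.05, size=m)   # per-imputation point estimates
var = np.full(m, 0.04)                       # per-imputation variances

pooled = est.mean()                          # pooled point estimate
within = var.mean()                          # within-imputation variance
between = est.var(ddof=1)                    # between-imputation variance
total_var = within + (1 + 1 / m) * between   # Rubin's total variance
se = np.sqrt(total_var)                      # pooled standard error
```

The `(1 + 1/m)` factor inflates the between-imputation component to account for using a finite number of imputations.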
“…The Super Learner algorithm is an evolution of the Stacking algorithm. Previous research results indicate that the Super Learner integrated model proposed by Van der Laan et al can independently select the base classifier according to the data structure of the dataset and the performance of the classifier, so as to improve the classification performance and robustness of the model [38][39][40][41][42][43][44]. Besides, if none of the candidate models in the Super Learner's classifier algorithm library can achieve the prespecified accuracy, the performance of the Super Learner is at least as good as the best algorithm in the candidate algorithm library, or it gradually approaches the best algorithm.…”
Section: Heterogeneous Ensemble
confidence: 99%
“…This paper uses the Z-score standardization method to standardize the data [39,41]. The Z-score standardization method is shown in…”
Section: Data Preprocessing
confidence: 99%
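The Z-score standardization referenced in this excerpt rescales each feature to mean 0 and standard deviation 1 via z = (x − μ)/σ; a minimal NumPy sketch (the feature matrix is an arbitrary example):

```python
# Column-wise Z-score standardization: z = (x - mean) / std.
import numpy as np

X = np.array([[1.0, 200.0],
              [2.0, 300.0],
              [3.0, 400.0]])

mu = X.mean(axis=0)       # per-feature mean
sigma = X.std(axis=0)     # per-feature standard deviation
Z = (X - mu) / sigma      # each column now has mean 0, std 1
```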