2020
DOI: 10.1101/2020.07.08.193664
Preprint

Bootstrap aggregating improves the generalizability of Connectome Predictive Modelling

Abstract: It is a long-standing goal of neuroimaging to produce reliable, generalizable models of brain-behavior relationships. More recently, data-driven predictive models have become popular. Overfitting is a common problem with statistical models and impedes model generalization. Cross-validation (CV) is often used to give more balanced estimates of performance. However, CV does not provide guidance on how best to apply the resulting models out-of-sample. As a solution, this study proposes an ensemble learnin…

Cited by 3 publications (5 citation statements)
References 51 publications
“…As the test set resting-state scan contained only 200 time points, more time points may be needed for connectivity matrices to have sufficient variation across individuals in order to accurately predict complex phenotypes such as CR. Advanced modelling techniques, such as bootstrap aggregating (O'Connor et al, 2020) and partial least squares regression (Yoo et al, 2018), when implemented within CPM frameworks have also been shown to improve generalizability to external datasets.…”
Section: Discussion (mentioning)
confidence: 99%
“…The linear models from all 100 iterations were then combined into a single model following a bootstrap aggregating (“bagging”) approach that has been shown to reduce overfitting and improve model accuracy (Fig. 1B, step 5; O’Connor et al, 2020). This bagged model was then used to predict Day 2 gISC from Day 2 RSFC (Fig.…”
Section: Results (mentioning)
confidence: 99%
“…Because the HCP 7T dataset is composed of data from individuals of varying degrees of genetic relatedness (monozygotic and dizygotic twins, non-twin siblings, and unrelated individuals; 93 unique families), all individuals from the same family were randomly assigned to one of two groups of 88 (i.e., split-half cross-validation), with one group being used to train a model that would then be tested on the other (and vice versa). The following approach was then applied to 100 of these random splits of the data to assess the performance of rCPM across different training/testing sets and to build a bagged model that is more robust to overfitting (O’Connor et al, 2020). Calculate leave-one-out (LOO) ISC according to the method described in section 2.4 separately for each group of 88 subjects.…”
Section: Methods (mentioning)
confidence: 99%
“…As the test set resting-state scan contained only 200 time points, more time points may be needed for connectivity matrices to have sufficient variation across individuals in order to accurately predict complex phenotypes such as CR. Advanced modeling techniques, such as bootstrap aggregating (O’Connor et al, 2020) and partial least squares regression (Yoo et al, 2018), when implemented within CPM frameworks have also been shown to improve generalizability to external datasets.…”
Section: Discussion (mentioning)
confidence: 99%
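
Several of the statements above describe combining many linear models, each fitted on a different resample of the training data, into a single "bagged" model. The following is a minimal sketch of that general idea, not the pipeline used by O'Connor et al. or by the citing papers. It assumes a single summary feature per subject (for example, a summed network strength), synthetic data, and resampling with replacement; the function names `fit_bagged_linear` and `predict` are hypothetical.

```python
import numpy as np

def fit_bagged_linear(x, y, n_boot=100, seed=0):
    """Fit one linear model per bootstrap resample and average the coefficients."""
    rng = np.random.default_rng(seed)
    n = len(x)
    coefs = np.empty((n_boot, 2))
    for b in range(n_boot):
        # Draw n subjects with replacement (one bootstrap resample).
        idx = rng.integers(0, n, size=n)
        # np.polyfit with deg=1 returns [slope, intercept].
        coefs[b] = np.polyfit(x[idx], y[idx], deg=1)
    # For linear models, averaging coefficients is equivalent to
    # averaging the predictions of the individual models.
    return coefs.mean(axis=0)

def predict(coef, x_new):
    """Apply the bagged linear model to new (out-of-sample) feature values."""
    slope, intercept = coef
    return slope * x_new + intercept

# Synthetic illustration only: y depends linearly on x plus noise.
rng = np.random.default_rng(42)
x = rng.normal(size=200)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=200)

bagged_coef = fit_bagged_linear(x, y, n_boot=100)
y_hat = predict(bagged_coef, np.array([0.0, 1.0]))
```

Because the result is a single pair of coefficients, it can be applied directly to an external dataset, which is the out-of-sample use case the abstract says cross-validation alone does not address.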