2019
DOI: 10.1101/785519
Preprint

Selecting the most important self-assessed features for predicting conversion to Mild Cognitive Impairment with Random Forest and Permutation-based methods

Abstract: Alzheimer's Disease (AD) is a complex, multifactorial and comorbid condition. The asymptomatic behavior in the early stages makes the identification of the disease onset particularly challenging. Mild cognitive impairment (MCI) is an intermediary stage between the expected decline of normal aging and the pathological decline associated with dementia. The identification of risk factors for MCI is thus sorely needed. Self-reported personal information such as age, education, income level, sleep, diet, physical e…

Cited by 17 publications (22 citation statements)
References 29 publications (15 reference statements)
“…It performs better for large, high-dimensional data sets and is more robust to noise and feature selection [34, 61–63, 65]. Additionally, the prediction ability of Random Forest is resistant to the multicollinearity of the driven variables [66–68], so all the variables in Equation (2) were used to predict the spatially continuous TanSat SIF. The main parameters used in RF are the numbers of input prediction variables and decision trees.…”
Section: Random Forest Approach For SIF Modeling
Citation type: mentioning
confidence: 99%
“…Although there is a study using random forest and permutation-based methods to select important variables for predicting conversion to MCI (Gómez-Ramírez et al., 2019), to the best of our knowledge, this is the first study investigating the predictive rather than associative value of the odors in the OI test for incident dementia in the elderly. There are several strengths in our study.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…To mask the information on a variable during validation, instead of removing the variable from the data set, the PI method replaces it with random noise by shuffling the values of the variable, i.e., using values from other participants (Breiman, 2001; Fisher et al., 2019). The relative importance of a variable was calculated as the accuracy decrease of the variable relative to the range of the accuracy decreases of all the variables (Gómez-Ramírez et al., 2019).…”
Section: Methods
Citation type: mentioning
confidence: 99%
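
The permutation importance (PI) procedure quoted above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the cited authors' implementation: the toy data, the `predict` stand-in for a fitted model, and the helper names are all hypothetical, and "relative to the range" is read here as dividing each accuracy decrease by (max − min) of all decreases, which is one possible interpretation of the quoted sentence.

```python
import random

def accuracy(predict, X, y):
    """Fraction of rows where predict(row) equals the label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean accuracy decrease per column after shuffling that column across rows."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    decreases = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # each row now gets a value from another participant
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            total += baseline - accuracy(predict, X_perm, y)
        decreases.append(total / n_repeats)
    return decreases

def relative_importance(decreases):
    """Assumed reading: each decrease divided by the range (max - min) of all decreases."""
    span = max(decreases) - min(decreases)
    if span == 0:
        return [0.0 for _ in decreases]
    return [d / span for d in decreases]

# Toy data (hypothetical): the label depends only on column 0; column 1 is noise.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]

def predict(row):
    """Hypothetical stand-in for a fitted model; it ignores column 1."""
    return int(row[0] > 0.5)

decreases = permutation_importance(predict, X, y)
relative = relative_importance(decreases)
print(decreases, relative)
```

Because the stand-in model never reads column 1, shuffling that column leaves accuracy unchanged, while shuffling column 0 pushes accuracy toward chance. Note that the variable is masked rather than dropped, so the model is not refit.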
“…Variables’ importance was evaluated using both the permutation importance (PI) by the multivariate logistic regression (LR) model and Gini importance (GI) by the random forest (RF) model (27–29) (eMethods 1). We used the K-fold cross-validation method to calculate the predictive ability indices (30) (eMethods 2), including accuracy, sensitivity, specificity, and area under the receiver operating characteristic (ROC) curve (31).…”
Section: Methods
Citation type: mentioning
confidence: 99%
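
The K-fold evaluation quoted above can be illustrated with a small stdlib-only sketch. Several assumptions apply: the one-feature threshold "model" is a hypothetical stand-in for the logistic regression in the quoted study, the data are invented, and the four indices are computed once on pooled out-of-fold predictions (averaging per-fold values is an equally common choice that the quote does not settle).

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds of near-equal size."""
    start = 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        test = list(range(start, start + size))
        train = [j for j in range(n) if j < start or j >= start + size]
        yield train, test
        start += size

def fit_threshold(x, y):
    """Hypothetical toy model: midpoint between the two class means."""
    pos = [v for v, t in zip(x, y) if t == 1]
    neg = [v for v, t in zip(x, y) if t == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def roc_auc(y_true, scores):
    """Share of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def evaluate(y_true, y_pred, scores):
    """Accuracy, sensitivity, specificity, and AUC from labels, predictions, scores."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "auc": roc_auc(y_true, scores),
    }

# Invented one-feature data set: higher x tends to mean class 1.
x = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y = [1, 1, 1, 0, 0, 0, 1, 0]

# Each sample is scored by a model fitted without it (out-of-fold prediction).
scores = [0.0] * len(x)
for train, test in k_fold_indices(len(x), k=4):
    threshold = fit_threshold([x[i] for i in train], [y[i] for i in train])
    for i in test:
        scores[i] = x[i] - threshold  # signed distance from the fold's threshold

y_pred = [int(s > 0) for s in scores]
metrics = evaluate(y, y_pred, scores)
print(metrics)
```

Pooling the out-of-fold predictions avoids undefined sensitivity, specificity, or AUC in folds that happen to contain only one class, which occurs with the first fold of this toy data.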