2011
DOI: 10.1093/bioinformatics/btr597

MissForest—non-parametric missing value imputation for mixed-type data

Contact: stekhoven@stat.math.ethz.ch; buhlmann@stat.math.ethz.ch

Cited by 4,080 publications (3,107 citation statements)
References 20 publications
“…Model 3 was further controlled for the effects of the co-parents' disorder(s). One hundred multiple imputations were performed using the MissForest procedure based on random forests (Stekhoven and Buhlmann, 2012) to adjust for missing data in coparents (24 diagnoses of SUD, mood and anxiety disorders and 111 diagnoses of behavioral disorders). In an additional model, we adjusted for disorder severity according to the GAF score.…”
Section: Discussion (mentioning)
confidence: 99%
“…We note that 41 of these accessions are also present in the set of genotypes that we analysed in our study (Supplementary Table 7). Scenario I: we removed 29 out of 199 accessions that did not contain data about at least 50 traits; the missing values for the remaining accessions were imputed for all traits by using the most recent robust imputation method suitable for mixed data types (that is, categorical and continuous) based on random forests via the missForest R package 69 ; Scenario II: we removed 29 out of 199 accessions that did not contain data about at least 50 traits; the missing values for the remaining accessions were imputed for all traits by using random forests imputation method; finally, only defence traits and developmental traits were used; Scenario III: like Scenario II but only for the 41 accession in the overlap between our set of genotypes and the population used in Atwell et al 53 ; Scenario IV: we removed 29 out of 199 accessions that did not contain data about at least 50 traits;…”
Section: Methods (mentioning)
confidence: 99%
“…the missing values for the remaining accessions were imputed for only defence traits and developmental traits by using random forests imputation method via the missForest R package 69 ; and Scenario V: like Scenario IV but only for the 41 accession in the overlap between our set of genotypes and the population used in Atwell et al 53 . These five scenarios were necessary to control and investigate the effects of missing values and the way in which they were imputed.…”
Section: Methods (mentioning)
confidence: 99%
“…In other words, the outcome variables are not assumed to have any particular distributional form (and nor are the errors), and thus can be normal, nonnormal continuous, ordinal, dichotomous, nominal, or count, among others. RFI is conducted using a seven-step process that is outlined in Stekhoven & Bühlmann (2011) paper.…”
mentioning
confidence: 99%