2008
DOI: 10.1186/1471-2105-9-307

Conditional variable importance for random forests

Abstract: Background: Random forests are becoming increasingly popular in many scientific fields because they can cope with "small n large p" problems, complex interactions and even highly correlated predictor variables. Their variable importance measures have recently been suggested as screening tools for, e.g., gene expression studies. However, these variable importance measures show a bias towards correlated predictor variables.

Cited by 2,599 publications (2,159 citation statements)
References 29 publications
“…Later, Strobl et al [40] pointed out another issue of RF variable importance which shows a bias towards correlated predictor variables. The issue of correlated feature variables happens commonly in high-dimensional bioinformatics tasks, e.g.…”
Section: Revised Rf Feature Importance (mentioning)
confidence: 99%
“…genomics. This paper [40] developed a conditional permutation scheme which used the partition automatically provided by the fitted model as a conditioning grid. The resulting measure was claimed to reflect the true impact of each predictor (variable) better than the original, marginal approach.…”
Section: Revised Rf Feature Importance (mentioning)
confidence: 99%
“…Strobl et al [2008] suggested using a conditional permutation scheme to calculate VI. The variables correlated with the variable of interest are empirically determined, and then the partitions in the individual tree are utilized to permute the variable of interest within blocks.…”
Section: Other Variable Importance Measures (mentioning)
confidence: 99%
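
The block-wise permutation described in the statement above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation in the party package: the function names (`block_ids`, `permute_within_blocks`, `conditional_importance`) are hypothetical, the cutpoints are supplied by hand rather than taken from the split points of fitted trees, and mean squared error stands in for whatever accuracy measure a real forest would use.

```python
import random
from bisect import bisect_right
from collections import defaultdict


def block_ids(z, cutpoints):
    """Assign each value of the conditioning variable z to a grid cell.

    Assumption: cutpoints are given by hand here; in the scheme described
    above they would come from the partition of the fitted model.
    """
    cuts = sorted(cutpoints)
    return [bisect_right(cuts, v) for v in z]


def permute_within_blocks(x, blocks, rng):
    """Shuffle x separately inside each block; values never cross blocks."""
    members = defaultdict(list)
    for i, b in enumerate(blocks):
        members[b].append(i)
    permuted = list(x)
    for idx in members.values():
        shuffled = idx[:]
        rng.shuffle(shuffled)
        for target, source in zip(idx, shuffled):
            permuted[target] = x[source]
    return permuted


def conditional_importance(predict, X, y, j, blocks, rng, n_repeats=20):
    """Average increase in mean squared error after permuting column j
    within blocks. X is a list of rows (lists); predict maps a row to a number.
    """
    def mse(rows):
        return sum((predict(r) - t) ** 2 for r, t in zip(rows, y)) / len(y)

    base = mse(X)
    total = 0.0
    for _ in range(n_repeats):
        col = permute_within_blocks([r[j] for r in X], blocks, rng)
        X_perm = [r[:j] + [v] + r[j + 1:] for r, v in zip(X, col)]
        total += mse(X_perm) - base
    return total / n_repeats


# Toy usage: z is the variable conditioned on, x the variable of interest.
rng = random.Random(0)
z = [0.1, 0.2, 0.9, 1.1, 1.5, 2.3]
x = [10, 11, 12, 20, 21, 30]
blocks = block_ids(z, cutpoints=[1.0, 2.0])
x_perm = permute_within_blocks(x, blocks, rng)
```

If every observation is placed in a single block, the procedure reduces to the ordinary marginal permutation; finer blocks keep more of the dependence between the permuted variable and the conditioning variable.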
“…Here, the environmental variables are strongly covariant and the model contains more than one variable type (all continuous except for morphotype, which is categorical). To account for this and aid in interpretation of the rankings, an unbiased, conditional variable importance ranking method was incorporated via the party package in R, which disentangles the most important variable from the model (Strobl et al, 2008). This method examines whether a correlation between the response variable and a predictor is conditional on another variable proceeding it in the tree, thereby identifying the most influential variable and demoting others (Strobl et al, 2008).…”
Section: Methods (mentioning)
confidence: 99%