This article considers a measure of variable importance frequently used in variable-selection methods based on decision trees and tree-based ensemble models. These models include CART, random forests, and gradient boosting machines. The measure of variable importance is defined as the total heterogeneity reduction produced by a given covariate on the response variable when the sample space is recursively partitioned. Despite its popularity, some authors have shown that this measure is biased to the extent that, under certain conditions, there may be dangerous effects on variable selection. Here we present a simple and effective method for bias correction, focusing on the easily generalizable case of the Gini index as a measure of heterogeneity.

M. SANDRI AND P. ZUCCOLOTTO

(a) Evaluation of the reduction of (out-of-bag) predictive accuracy after a random permutation of the values assumed by X_i; and (b) the total heterogeneity reduction produced by X_i on the response variable, obtained by adding up all the decreases of the heterogeneity index in the tree nodes where X_i is selected for splitting.

This article focuses on the class of VI measures described in (b), originally introduced by Breiman et al. (1984) in the context of CART. There are several influential theoretical investigations (Breiman 2001a; Friedman 2001) and many empirical applications (e.g., Friedman and Meulman 2003; Svetnik et al. 2005; Menze et al. 2007; De'ath 2007) of these measures in the literature. Much of this work centered on the original form of the measures introduced by Breiman et al. (1984). In addition, these measures are often the default in data mining software, such as the randomForest package in R (Breiman et al. 2006), the gbm package in R (Ridgeway 2007), the boost Stata command (Schonlau 2005), and the MART package in S-Plus and R (Friedman 2002).
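To make the definition in (b) concrete, the following is a minimal sketch of how a total heterogeneity reduction could be accumulated for a single greedily grown tree, using the Gini index as the heterogeneity measure. It is an illustration under stated assumptions, not the implementation used by any of the packages cited above: the function names (gini, best_split, gini_vi) are hypothetical, splits are binary thresholds on numeric covariates, and each node's decrease is weighted by the fraction of the sample reaching that node, which is one common convention.

```python
from collections import Counter, defaultdict


def gini(labels):
    """Gini heterogeneity index of a set of class labels: 1 - sum_k p_k^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, n_features):
    """Return (feature, threshold, gini_decrease) for the best binary split,
    or None if no covariate offers a candidate cut point."""
    n = len(labels)
    parent = gini(labels)
    best = None
    for j in range(n_features):
        # Every distinct value except the largest is a candidate threshold.
        for t in sorted({r[j] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[j] <= t]
            right = [y for r, y in zip(rows, labels) if r[j] > t]
            gain = (parent
                    - len(left) / n * gini(left)
                    - len(right) / n * gini(right))
            if best is None or gain > best[2]:
                best = (j, t, gain)
    return best


def gini_vi(rows, labels, n_features, n_total=None, vi=None, min_gain=1e-12):
    """Recursively partition the sample and add up, per covariate, the Gini
    decreases at the nodes where that covariate is selected for splitting.

    Assumption: each decrease is weighted by (node size / total sample size).
    """
    if vi is None:
        vi = defaultdict(float)
    if n_total is None:
        n_total = len(labels)
    split = best_split(rows, labels, n_features)
    if split is None or split[2] <= min_gain:
        return vi  # no split reduces heterogeneity: this node is a leaf
    j, t, gain = split
    vi[j] += len(labels) / n_total * gain
    left = [(r, y) for r, y in zip(rows, labels) if r[j] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[j] > t]
    for part in (left, right):
        part_rows = [r for r, _ in part]
        part_labels = [y for _, y in part]
        gini_vi(part_rows, part_labels, n_features, n_total, vi, min_gain)
    return vi


# Toy data: covariate 0 separates the two classes, covariate 1 does not.
rows = [(0, 5), (0, 7), (1, 5), (1, 7)]
labels = [0, 0, 1, 1]
vi = gini_vi(rows, labels, n_features=2)
print(dict(vi))  # covariate 0 receives the whole reduction
```

In this toy sample the root node has Gini index 0.5, the split on covariate 0 produces two pure child nodes, and covariate 1 is never selected, so the whole reduction is credited to covariate 0. Note that the number of candidate thresholds in best_split grows with the number of distinct values of a covariate.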
Some authors have shown that these VI measures are biased in a way that may have, under certain conditions, potentially dangerous effects on variable selection. Breiman et al. (1984) first noted that they are biased in favor of variables that have more values (i.e., fewer missing values, more categories, or distinct numerical values) and thus offer more splits. This means that variable selection may be affected by covariate characteristics other than information content. Subsequently, White and Liu (1994), Kononenko (1995), Dobra and Gehrke (2001), and Strobl (2005) investigated in greater detail the nature of the bias in information-based VI measures and elucidated the relation between bias and the covariate's number of values.

When the Gini gain is used as the splitting criterion for the tree nodes, the resulting total heterogeneity reduction is called the "Gini VI measure." Strobl et al. (2007b) reinterpreted and systematized previous results about this measure and identified three fundamental sources of bias: (a) the bias of the Gini estimator, (b) the variance of the Gini estimator, and (c) the effects of multiple comparisons.

Recently, several authors have proposed...