In this paper a novel method is proposed to combine decision tree classifiers using calculated classification confidence values. The confidence in a classification is based on the distance of the input sample to the relevant decision boundary. It is shown that these values, provided by the individual classification trees, can be integrated to derive a consensus decision. The proposed combination scheme, confidence-weighted majority voting, possesses attractive features compared to other approaches: there is no need for an auxiliary combiner or gating network, as in the Mixture of Experts architecture, and the method is not limited to decision trees with axis-parallel splits; it is applicable to any kind of classifier that uses hyperplanes to partition the input space.
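To illustrate the combination rule, here is a minimal sketch of confidence-weighted majority voting, assuming each tree reports a class label together with a per-sample confidence value in [0, 1]. The function name and array layout are our own illustrative choices, not taken from the paper.

```python
import numpy as np

def confidence_weighted_vote(labels, confidences, n_classes):
    """Consensus decision by confidence-weighted majority voting.

    labels      -- per-tree predicted class indices, shape (n_trees,)
    confidences -- per-tree confidence in this sample's label, shape (n_trees,)
    """
    scores = np.zeros(n_classes)
    for label, conf in zip(labels, confidences):
        scores[label] += conf        # each tree's vote is weighted by its confidence
    return int(np.argmax(scores))    # class with the largest accumulated confidence wins

# Example: two moderately confident trees voting for class 1 outweigh
# one highly confident tree voting for class 0 (0.6 + 0.55 > 0.9).
print(confidence_weighted_vote(np.array([0, 1, 1]), np.array([0.9, 0.6, 0.55]), 2))  # -> 1
```

Because the weights are computed per sample, a tree that is confident in one region of the input space can dominate there while being discounted elsewhere, without any trained combiner.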
In this paper a novel method is proposed that extends the decision tree framework, allowing standard decision tree classifiers to provide a unique certainty value for every input sample they classify. This value is calculated for each input sample individually and represents the classifier's certainty in that classification.

The algorithm consists of three main parts (sketched in code below):

1) The input sample's distance to the decision boundary is calculated. This step involves solving a set of linearly constrained quadratic programs. The distance-calculating procedure also allows the use of different distance metrics, where the minimal-distance projection is not necessarily invariant.

2) Kernel density estimation is performed on the distance values of a training set to obtain conditional true and false classification profiles.

3) Using the conditional densities, a Bayesian computation yields the conditional true classification probability, which is used as the classification certainty.

The proposed algorithm is not limited to axis-parallel trees; it can be applied to any kind of decision tree whose decisions are hyperplanes (not necessarily parallel to the axes). The algorithm does not alter the tree structure, and the growth process is not modified. It only uses the training data to obtain true and false classification profiles conditional on the distance from the decision boundary.

The usability of the method is demonstrated on two examples: an artificial two-dimensional dataset and a real-world nine-dimensional dataset. It is shown that the method can significantly increase classification accuracy (at the cost of rejecting a certain number of samples whose classification would be too "risky"). It is also demonstrated that the classification certainty value can be effectively used for ranking purposes.
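The following sketch illustrates the three steps under simplifying assumptions: a binary problem, Euclidean distance, and a leaf whose decision region is the polytope {z : Az <= b}. The helper names, the use of scipy.optimize.minimize as the QP solver, and gaussian_kde as the density estimator are illustrative choices, not the paper's implementation.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import gaussian_kde

# Step 1: distance from sample x to the boundary of the region {z : A z <= b}.
# For each face i we solve a linearly constrained QP: minimize ||z - x||^2
# subject to a_i^T z = b_i while the remaining constraints keep z on the
# region's boundary. Other distance metrics could replace the squared
# Euclidean objective here.
def distance_to_boundary(x, A, b):
    best = np.inf
    for i in range(len(b)):
        cons = [{"type": "eq", "fun": lambda z, i=i: A[i] @ z - b[i]}]
        cons += [{"type": "ineq", "fun": lambda z, j=j: b[j] - A[j] @ z}
                 for j in range(len(b)) if j != i]
        res = minimize(lambda z: np.sum((z - x) ** 2), x0=x, constraints=cons)
        if res.success:
            best = min(best, np.sqrt(res.fun))  # res.fun is the squared distance
    return best

# Step 2: kernel density estimates over the boundary distances of correctly
# and incorrectly classified training samples give the conditional profiles
# p(d | true) and p(d | false).
def fit_profiles(dist_true, dist_false):
    return gaussian_kde(dist_true), gaussian_kde(dist_false)

# Step 3: Bayes' rule converts the profiles into the certainty value
# P(true | d) = p(d | true) P(true) / (p(d | true) P(true) + p(d | false) P(false)).
def certainty(d, kde_true, kde_false, p_true):
    num = kde_true(d)[0] * p_true
    den = num + kde_false(d)[0] * (1.0 - p_true)
    return num / den
```

Thresholding the resulting certainty implements the reject option described above: samples whose conditional true classification probability falls below the threshold are withheld as too "risky", and sorting samples by the same value supports the ranking use case.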