Abstract-An ensemble method produces diverse classifiers and combines their decisions to form the ensemble's decision. A number of methods have been investigated for constructing ensembles, some of which train the classifiers with generated patterns. This study investigates a new technique of training-pattern generation that is simple and effective for ensemble construction. The method modifies the feature values of some patterns with the values of other patterns to generate different training sets for different classifiers. An ensemble of decision trees based on the proposed technique was evaluated on a suite of 30 benchmark classification problems and was found to achieve performance better than or competitive with related conventional methods. Furthermore, two hybrid ensemble methods have been investigated that incorporate the proposed pattern-generation technique into two popular ensemble methods, bagging and the random subspace method (RSM). It is found that the performance of the bagging and RSM algorithms can be improved by incorporating feature values modification into their training processes.
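The core idea can be sketched as follows. This is an illustrative reconstruction, not the authors' exact algorithm: the function name `modify_features` and the fraction parameters are assumptions chosen for the example.

```python
import numpy as np

def modify_features(X, rng, pattern_frac=0.5, feature_frac=0.3):
    """Return a copy of X in which a randomly selected subset of patterns
    has some of its feature values overwritten by the values of other,
    randomly chosen donor patterns (a sketch of the paper's idea; the
    fractions here are illustrative, not the authors' settings)."""
    Xm = X.copy()
    n, d = X.shape
    # Patterns whose feature values will be modified
    targets = rng.choice(n, size=int(pattern_frac * n), replace=False)
    for i in targets:
        donor = rng.integers(n)  # pattern supplying replacement values
        feats = rng.choice(d, size=max(1, int(feature_frac * d)), replace=False)
        Xm[i, feats] = X[donor, feats]  # copy the donor's feature values
    return Xm

# One independently modified training set per ensemble member:
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))
training_sets = [modify_features(X, rng) for _ in range(10)]
```

Each base classifier would then be trained on its own modified copy of the data, which is what introduces diversity among the ensemble members.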
Experimental investigation of different types of modification techniques finds that modifying feature values with the values of patterns from the same class is better for generalization.

Index Terms-Decision tree ensemble, diversity, feature values modification, generalization, pattern generation.
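The same-class restriction mentioned above can be sketched as a small variant of the basic modification step. Again, the function name and parameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def modify_features_same_class(X, y, rng, pattern_frac=0.5, feature_frac=0.3):
    """Same-class variant (a sketch): feature values of selected patterns
    are overwritten with values from donor patterns of the SAME class,
    so the modified values remain class-consistent."""
    Xm = X.copy()
    n, d = X.shape
    targets = rng.choice(n, size=int(pattern_frac * n), replace=False)
    for i in targets:
        donors = np.flatnonzero(y == y[i])  # candidates from pattern i's class
        donor = rng.choice(donors)
        feats = rng.choice(d, size=max(1, int(feature_frac * d)), replace=False)
        Xm[i, feats] = X[donor, feats]
    return Xm

rng = np.random.default_rng(1)
X = rng.normal(size=(60, 5))
y = rng.integers(0, 3, size=60)
X_mod = modify_features_same_class(X, y, rng)
```

Restricting donors to the same class perturbs the data while keeping each modified pattern plausible for its label, which is consistent with the paper's finding that this variant generalizes better.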
M. M. Hafizur Rahman is with the Dept. of Computer Science, KICT, International Islamic University Malaysia, Jalan Gombak, 50728 Selayang, Selangor, Malaysia (e-mail: hafizur@iium.edu.my).

K. Murase is with the Graduate School of Engineering, University of Fukui, 3-9-1 Bunkyo, Fukui 910-8507, Japan (e-mail: murase@u-fukui.ac.jp).

I. INTRODUCTION

The goal of constructing an ensemble of several classifiers is to achieve better generalization ability than any individual classifier. The inspiration for building an ensemble is the same as for establishing a committee of people: each member of the committee should be as competent as possible, but the members should also be complementary to one another. If the members are not complementary, i.e., if they always agree, the committee is unnecessary, since any one member could perform the task of the committee alone. If the members are complementary, then when one or a few members make an error, there is a high probability that the remaining members can correct it. Thus, for ensemble construction, proper diversity among the classifiers (also called base classifiers) is considered an important parameter, so that the failure of one may be compensated by the others [1], [2].

An ensemble method produces diverse classifiers and combines their decisions to form the ensemble's decision. As base classifiers, decision trees (DTs) are among the most commonly used methods because they are efficient [3], [4]. Considerable work has been done to determine effective ways of constructing diverse DTs so that the benefit of ensemble construction can be realized. There are many ways, such as using different training sets and learning metho...