2006
DOI: 10.1007/s10796-006-8779-8

Splitting methods for decision tree induction: An exploration of the relative performance of two entropy-based families

Abstract: Decision tree (DT) induction is among the more popular of the data mining techniques. An important component of DT induction algorithms is the splitting method, with the most commonly used method being based on the Conditional Entropy (CE) family. However, it is well known that there is no single splitting method that will give the best performance for all problem instances. In this paper we explore the relative performance of the Conditional Entropy family and another family that is based on the Class-Attribu…
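For readers unfamiliar with the Conditional Entropy family named in the abstract, the sketch below shows a generic information-gain split score, i.e. the reduction in class entropy produced by a candidate split. The function and variable names are illustrative only; the paper's specific CE variants (and the second, truncated family) are not reproduced here.

```python
# Minimal sketch of a Conditional Entropy (information gain) split score.
# Names are illustrative, not taken from the paper.
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy H(C) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(labels, groups):
    """H(C) - H(C | split), where `groups` are the label lists of each branch."""
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups if g)
    return entropy(labels) - conditional

# Example: a binary split of six labels into two pure branches.
parent = ["a", "a", "a", "b", "b", "b"]
left, right = ["a", "a", "a"], ["b", "b", "b"]
print(information_gain(parent, [left, right]))  # 1.0 bit: a perfect split
```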

Year Published (citing publications): 2007 to 2024
Cited by 9 publications (2 citation statements)
References 12 publications
“…For KNN and decision tree, the best value of parameter k and cp were decided through attempting different values. In addition, the performance of two split methods, information gain and Gini impurity were also investigated for decision tree [101]. The value corresponding to the highest accuracy (10-fold cross validation with 3 times repetition) was selected as the optimal values for the parameters.…”
Section: Classification (mentioning). Confidence: 99%
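The quoted study tunes a decision tree over two splitting criteria (information gain vs. Gini impurity) with repeated 10-fold cross-validation; it most likely used R's rpart/caret, where cp is the complexity parameter. A hedged scikit-learn transposition of that procedure is sketched below; the dataset and parameter values are placeholders, and ccp_alpha stands in for cp as the pruning knob.

```python
# Hedged sketch of the tuning procedure described in the citation statement,
# transposed to scikit-learn. Data and grid values are placeholders.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, RepeatedStratifiedKFold
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)  # placeholder data, not the study's dataset

param_grid = {
    "criterion": ["gini", "entropy"],      # Gini impurity vs. information gain splits
    "ccp_alpha": [0.0, 0.001, 0.01, 0.1],  # candidate pruning strengths (cp analogue)
}
cv = RepeatedStratifiedKFold(n_splits=10, n_repeats=3, random_state=0)
search = GridSearchCV(DecisionTreeClassifier(random_state=0),
                      param_grid, cv=cv, scoring="accuracy")
search.fit(X, y)
print(search.best_params_, search.best_score_)  # settings with the highest CV accuracy
```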
“…However, most of these investigations have focussed on classification rather than regression trees, where the popular alternatives are chi-squared, entropy and posterior improvement criteria (Taylor and Silverman 1993;Shi 1999;Osei-Bryson 2006). One study that does include regression trees is that by Bremner and Ross (2002), which reports greater success in modelling data that include interaction effects by local averaging of residual sum of squares, but for our purposes none of these investigations suggests a change of focus from the W criterion.…”
Section: Introduction (mentioning). Confidence: 99%
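The second citation statement contrasts classification-tree criteria with regression-tree criteria based on residual sums of squares. As a point of reference, the sketch below scores a regression-tree split by the reduction in residual sum of squares; this is a generic illustration under my own assumptions, not the W criterion or the local-averaging variant of Bremner and Ross (2002).

```python
# Generic RSS-reduction split score for a regression tree (illustrative only).
def rss(values):
    """Residual sum of squares around the group mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def rss_reduction(y, left_idx):
    """Drop in RSS from splitting response values `y` at the given left indices."""
    left_idx = set(left_idx)
    left = [y[i] for i in left_idx]
    right = [y[i] for i in range(len(y)) if i not in left_idx]
    return rss(y) - (rss(left) + rss(right))

y = [1.0, 1.2, 0.9, 5.0, 5.1, 4.8]  # toy responses with two clear groups
print(rss_reduction(y, [0, 1, 2]))  # large reduction: the split separates the groups
```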