1998
DOI: 10.1007/bfb0033269

Multi-interval discretization methods for decision tree learning

Abstract: Properly addressing the discretization of continuous-valued features is an important problem during decision tree learning. This paper describes four multi-interval discretization methods for the induction of decision trees, used in a dynamic fashion. We compare two known discretization methods to two new methods proposed in this paper: a histogram-based method and a neural-net-based method (LVQ). We compare them according to the accuracy of the resulting decision tree and to the compactness of the tr…
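The abstract names a histogram-based discretizer among the compared methods, but the details are cut off above. Purely as an illustration, and not the paper's actual algorithm, the sketch below shows one generic histogram-style, multi-interval discretization: equal-frequency binning whose interior quantile boundaries become the cut points. All function names are hypothetical.

```python
import numpy as np

def equal_frequency_cuts(values, n_intervals=4):
    """Generic histogram-style discretization: place cut points at
    quantile boundaries so each interval holds roughly the same
    number of samples. Returns the interior cut points only."""
    qs = np.linspace(0, 1, n_intervals + 1)[1:-1]   # interior quantiles
    return np.unique(np.quantile(values, qs))       # drop duplicate cuts

def assign_interval(x, cuts):
    """Map a single value to its interval index given sorted cut points."""
    return int(np.searchsorted(cuts, x, side="right"))

# Toy usage: one continuous feature split into 4 intervals.
feature = np.array([0.2, 0.5, 0.9, 1.3, 2.1, 2.2, 3.7, 4.0, 4.8, 5.5])
cuts = equal_frequency_cuts(feature, n_intervals=4)
print(cuts, [assign_interval(x, cuts) for x in feature])
```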

Cited by 31 publications (11 citation statements)
References 7 publications
“…Based on that data set we acquired the knowledge for classification. We used a binary [5] and n-ary decision tree induction algorithm [6] realized in our data mining tool DECISIONMASTER [7]. The n-ary decision tree can split up a numerical feature into more than two intervals which leads sometimes to a better performance than the one of a binary decision tree.…”
Section: Learning Of Classifier Knowledge
confidence: 99%
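The quoted passage contrasts binary and n-ary splits on a numeric feature. DECISIONMASTER's own interface is not shown here, so the snippet below is only a generic sketch of the routing difference: a binary node uses one threshold with two children, while an n-ary node uses k cut points defining k+1 child intervals.

```python
def binary_route(x, threshold):
    """Binary node: one threshold, two children."""
    return 0 if x <= threshold else 1

def nary_route(x, cut_points):
    """n-ary node: k sorted cut points define k+1 intervals (children).
    The child index is the interval the value falls into."""
    child = 0
    for cut in cut_points:
        if x <= cut:
            return child
        child += 1
    return child

# A value of 2.7 goes to child 1 of a binary node split at 2.0, but to
# child 2 of a 4-way node that splits the same feature at 1.0, 2.0, 3.5.
print(binary_route(2.7, 2.0), nary_route(2.7, [1.0, 2.0, 3.5]))
```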
“…Although EMD has demonstrated strong performance for naive-Bayes (Dougherty et al 1995;Perner and Trautzsch 1998), it was developed in the context of top-down induction of decision trees. It uses MDL as the termination condition.…”
Section: Entropy Minimization Discretization
confidence: 99%
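The EMD referred to here is entropy minimization discretization in the Fayyad & Irani style: candidate cut points are scored by the weighted class entropy of the two subsets they induce, and the lowest-scoring cut is kept. A minimal sketch of that score, with illustrative helper names rather than any particular implementation:

```python
import math
from collections import Counter

def entropy(labels):
    """Class entropy of a label multiset, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def cut_entropy(xs, ys, t):
    """Weighted average class entropy of the two subsets induced by
    splitting feature values xs at threshold t (EMD's split score)."""
    left = [y for x, y in zip(xs, ys) if x <= t]
    right = [y for x, y in zip(xs, ys) if x > t]
    n = len(ys)
    return len(left) / n * entropy(left) + len(right) / n * entropy(right)

# Candidate cuts here are midpoints between consecutive sorted values;
# EMD keeps the one with the lowest weighted entropy (the optimum is
# known to lie at a point where the class label changes).
xs = [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]           # already sorted
ys = ["a", "a", "a", "b", "b", "b"]
best = min(((x1 + x2) / 2 for x1, x2 in zip(xs, xs[1:])),
           key=lambda t: cut_entropy(xs, ys, t))
print(best)  # 2.25, the midpoint where the class changes
```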
“…Discretization of continuous attributes is fundamental to many decision tree algorithms and is therefore a well researched area in data mining [26]. Many decision tree algorithms such as ID3, C4.5 and CART all require binary splits at decision nodes [27]. The value at which this split occurs is usually determined by a discretization algorithm, although the difference here is that this discretization will occur in a dynamic manner as the tree is built, rather than occurring as a pre-processing step as occurs in nearest neighbour algorithms.…”
Section: B. Attribute Discretization
confidence: 99%
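To make the contrast with pre-processing concrete, the sketch below recomputes the cut point from the rows reaching each node while the tree is built, which is what dynamic discretization means in the quote above. The cut-selection rule is deliberately a trivial stand-in (the median) so the example stays short; all names are hypothetical.

```python
def choose_cut(values):
    """Stand-in for a real discretizer (e.g. the entropy-based score
    sketched earlier); here simply the median, to keep the example short."""
    s = sorted(values)
    return s[len(s) // 2]

def build_tree(rows, depth=0, max_depth=2):
    """Dynamic discretization: the cut for the (single) numeric feature
    is recomputed from the rows that reach this node, instead of being
    fixed once in a global pre-processing pass."""
    labels = [y for _, y in rows]
    if depth == max_depth or len(set(labels)) <= 1:
        return {"leaf": labels}
    t = choose_cut([x for x, _ in rows])      # local, node-specific cut
    left = [r for r in rows if r[0] <= t]
    right = [r for r in rows if r[0] > t]
    if not left or not right:
        return {"leaf": labels}
    return {"cut": t,
            "le": build_tree(left, depth + 1, max_depth),
            "gt": build_tree(right, depth + 1, max_depth)}

rows = [(0.5, "a"), (1.1, "a"), (2.3, "b"), (3.0, "b"), (4.2, "c"), (5.0, "c")]
print(build_tree(rows))
```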
“…For feature A, the boundary T min that minimises the entropy over all possible boundaries is selected [27]. The application of this will therefore result in a binary split, and the method can be applied recursively until a stopping criterion is met, in this case, a criterion based on the Minimum Description Length Principle.…”
Section: B. Attribute Discretization
confidence: 99%
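The Minimum Description Length stopping criterion mentioned here is, in the Fayyad & Irani formulation, a threshold on the information gain of the best cut. A minimal sketch of that test, assuming the standard formula (helper names are illustrative):

```python
import math
from collections import Counter

def ent(labels):
    """Class entropy in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def mdlp_accepts(y, y_left, y_right):
    """Fayyad & Irani MDL stopping test: keep recursing only while the
    information gain of the best cut exceeds the MDL-derived threshold."""
    n = len(y)
    gain = ent(y) - (len(y_left) / n) * ent(y_left) - (len(y_right) / n) * ent(y_right)
    k, k1, k2 = len(set(y)), len(set(y_left)), len(set(y_right))
    delta = math.log2(3 ** k - 2) - (k * ent(y) - k1 * ent(y_left) - k2 * ent(y_right))
    return gain > (math.log2(n - 1) + delta) / n

# A clean class boundary passes the test; a noisy split does not.
print(mdlp_accepts(["a"] * 5 + ["b"] * 5, ["a"] * 5, ["b"] * 5))          # True
print(mdlp_accepts(["a", "b"] * 5, ["a", "b"] * 3 + ["a"], ["b", "a", "b"]))  # False
```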