1997
DOI: 10.1017/s0269888997000015

Simplifying decision trees: A survey

Abstract: Induced decision trees are an extensively researched solution to classification tasks. For many practical tasks, the trees produced by tree-generation algorithms are not comprehensible to users due to their size and complexity. Although many tree induction algorithms have been shown to produce simpler, more comprehensible trees (or data structures derived from trees) with good classification accuracy, tree simplification has usually been of secondary concern relative to accuracy, and no attempt has been made …

Cited by 243 publications (141 citation statements); references 93 publications.
“…There are a number of reasons why the use of these techniques from the pattern recognition literature is appropriate in our context: (i) they are input/output modeling techniques that build the model exclusively from data (data-driven); (ii) they are particularly well suited for scenarios where the system cannot be completely characterized by first-principles models, while empirical evidence suggests that the input/output system behavior is nonlinear; (iii) they are capable of generating possibly nonlinear classification borders between two or more classes; (iv) classification functions of the kind that map actigraphy-derived variables to PS scores can, in theory, be approximated with arbitrary precision using neural network models of appropriate complexity (‘universal approximation theorem for neural networks’; Lloyd, 2003). Similarly, decision trees with sufficiently many nodes and leaves can provide comparable accuracy (Breslow and Aha, 1997).…”
Section: Pattern Recognition Methods
confidence: 99%
“…The literature on such algorithms is quite mature [22], [11], [4]. Attribute selection and tree pruning are two key techniques used to choose the right prediction attributes that best separate a given dataset into individual subspaces and build up a reliable regression model in each subspace [23], [34], [6], [7]. Our work is different in that it addresses cost concerns when data must be collected and processed over a resource-scarce environment such as a sensor network.…”
Section: Related Work
confidence: 99%
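The attribute selection the excerpt above refers to is typically done with an entropy-based splitting criterion. The following is a minimal sketch of information-gain attribute selection; the dataset layout (a list of `(feature_dict, label)` pairs) and function names are illustrative assumptions, not taken from the survey or the citing paper.

```python
# Sketch of entropy-based attribute selection (information gain), the kind of
# splitting criterion used by the tree-induction algorithms cited above.
# The (feature_dict, label) row layout is an assumption for illustration.
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr):
    """Entropy of the whole set minus the weighted entropy after
    partitioning the rows by the given attribute's values."""
    labels = [label for _, label in rows]
    base = entropy(labels)
    branches = {}
    for features, label in rows:
        branches.setdefault(features[attr], []).append(label)
    remainder = sum(len(b) / len(rows) * entropy(b) for b in branches.values())
    return base - remainder

def best_attribute(rows, attrs):
    """Pick the attribute with the highest information gain."""
    return max(attrs, key=lambda a: information_gain(rows, a))
```

For example, an attribute that perfectly separates the classes attains a gain equal to the full label entropy, while an uninformative attribute scores zero.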
“…The optimal tree is prescribed as the one that yields the lowest prediction error among the resulting subtrees based on classifying another independent test sample using each of these subtrees. Breiman et al (1984) and Breslow and Aha (1997) describe various other (mostly heuristic) pruning methods that can be used in this context, but by far, the most widely used technique is the aforementioned CCP method, which has been implemented in many other systems such as Splus and OC1 as well (see Breiman et al 1984).…”
Section: Introduction
confidence: 99%
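The CCP method discussed above ranks internal nodes by an "effective alpha" and repeatedly collapses the weakest link. Below is a minimal sketch of one weakest-link step, following Breiman et al. (1984); the `Node` representation and field names are assumptions for illustration, and real implementations select the final subtree by cross-validated error over the whole pruning sequence.

```python
# Sketch of one weakest-link step of cost-complexity pruning (CCP).
# A node's `errors` is the misclassification count it would incur as a leaf;
# this tree representation is an illustrative assumption.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    errors: int                              # errors if this node were a leaf
    children: List["Node"] = field(default_factory=list)

def walk(node):
    yield node
    for c in node.children:
        yield from walk(c)

def leaves(node):
    return [n for n in walk(node) if not n.children]

def subtree_errors(node):
    return sum(l.errors for l in leaves(node))

def weakest_link_alpha(node):
    """Effective alpha of collapsing an internal node t:
    (R(t) - R(T_t)) / (|leaves(T_t)| - 1)."""
    return (node.errors - subtree_errors(node)) / (len(leaves(node)) - 1)

def prune_weakest_link(root):
    """Collapse (turn into a leaf) the internal node whose removal costs
    the least accuracy per leaf saved, and return it."""
    internal = [n for n in walk(root) if n.children]
    target = min(internal, key=weakest_link_alpha)
    target.children = []
    return target
```

Iterating `prune_weakest_link` from the full tree down to the root yields the nested subtree sequence from which CCP picks the one with the lowest test-sample error.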
“…The problem is formulated as a parametric bilinear program, which is NP-hard, and a heuristic based on the Frank-Wolfe method is adopted to solve this problem. For further reading and surveys on constructing decision trees, we refer the reader to Breslow and Aha (1997), Li et al. (2001), Murthy (1997, 1998), and Safavian and Landgrebe (1991).…”
Section: Introduction
confidence: 99%