1997
DOI: 10.1017/s0269888997000015

Simplifying decision trees: A survey

Abstract: Induced decision trees are an extensively researched solution to classification tasks. For many practical tasks, the trees produced by tree-generation algorithms are not comprehensible to users due to their size and complexity. Although many tree induction algorithms have been shown to produce simpler, more comprehensible trees (or data structures derived from trees) with good classification accuracy, tree simplification has usually been of secondary concern relative to accuracy, and no attempt has been made …

Cited by 243 publications (141 citation statements); references 93 publications.
“…There are a number of reasons why the use of these techniques from the pattern recognition literature is appropriate in our context: (i) they are input/output modeling techniques that build the model exclusively from data (data-driven); (ii) they are particularly well suited for scenarios where the system cannot be completely characterized by first-principles models, while empirical evidence suggests that the input/output system behavior is nonlinear; (iii) they are capable of generating possibly nonlinear classification borders between two or more classes; (iv) classification functions of the kind that map actigraphy-derived variables to PS scores can, in theory, be approximated with arbitrary precision using neural network models of appropriate complexity (‘universal approximation theorem for neural networks’; Lloyd, 2003). Similarly, decision trees with sufficiently many nodes and leaves can provide comparable accuracy (Breslow and Aha, 1997).…”
Section: Pattern Recognition Methods
confidence: 99%
“…The literature on such algorithms is quite mature [22], [11], [4]. Attribute selection and tree pruning are two key techniques used to choose the right prediction attributes that best separate a given dataset into individual subspaces and build up a reliable regression model in each subspace [23], [34], [6], [7]. Our work is different in that it addresses cost concerns when data must be collected and processed over a resource-scarce environment such as a sensor network.…”
Section: Related Work
confidence: 99%
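The attribute selection the excerpt above refers to is typically done with an entropy-based splitting criterion. The following is a minimal sketch of information-gain attribute selection; the dataset layout (a list of `(feature_dict, label)` pairs) and function names are illustrative assumptions, not taken from the survey or the citing paper.

```python
# Sketch of entropy-based attribute selection (information gain), the kind of
# splitting criterion used by the tree-induction algorithms cited above.
# The (feature_dict, label) row layout is an assumption for illustration.
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(rows, attr):
    """Entropy of the whole set minus the weighted entropy after
    partitioning the rows by the given attribute's values."""
    labels = [label for _, label in rows]
    base = entropy(labels)
    branches = {}
    for features, label in rows:
        branches.setdefault(features[attr], []).append(label)
    remainder = sum(len(b) / len(rows) * entropy(b) for b in branches.values())
    return base - remainder

def best_attribute(rows, attrs):
    """Pick the attribute with the highest information gain."""
    return max(attrs, key=lambda a: information_gain(rows, a))
```

For example, an attribute that perfectly separates the classes attains a gain equal to the full label entropy, while an uninformative attribute scores zero.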
“…The optimal tree is prescribed as the one that yields the lowest prediction error among the resulting subtrees based on classifying another independent test sample using each of these subtrees. Breiman et al (1984) and Breslow and Aha (1997) describe various other (mostly heuristic) pruning methods that can be used in this context, but by far, the most widely used technique is the aforementioned CCP method, which has been implemented in many other systems such as Splus and OC1 as well (see Breiman et al 1984).…”
Section: Introduction
confidence: 99%
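The CCP method discussed above ranks internal nodes by an "effective alpha" and repeatedly collapses the weakest link. Below is a minimal sketch of one weakest-link step, following Breiman et al. (1984); the `Node` representation and field names are assumptions for illustration, and real implementations select the final subtree by cross-validated error over the whole pruning sequence.

```python
# Sketch of one weakest-link step of cost-complexity pruning (CCP).
# A node's `errors` is the misclassification count it would incur as a leaf;
# this tree representation is an illustrative assumption.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    errors: int                              # errors if this node were a leaf
    children: List["Node"] = field(default_factory=list)

def walk(node):
    yield node
    for c in node.children:
        yield from walk(c)

def leaves(node):
    return [n for n in walk(node) if not n.children]

def subtree_errors(node):
    return sum(l.errors for l in leaves(node))

def weakest_link_alpha(node):
    """Effective alpha of collapsing an internal node t:
    (R(t) - R(T_t)) / (|leaves(T_t)| - 1)."""
    return (node.errors - subtree_errors(node)) / (len(leaves(node)) - 1)

def prune_weakest_link(root):
    """Collapse (turn into a leaf) the internal node whose removal costs
    the least accuracy per leaf saved, and return it."""
    internal = [n for n in walk(root) if n.children]
    target = min(internal, key=weakest_link_alpha)
    target.children = []
    return target
```

Iterating `prune_weakest_link` from the full tree down to the root yields the nested subtree sequence from which CCP picks the one with the lowest test-sample error.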
“…The problem is formulated as a parametric bilinear program, which is NP-hard, and a heuristic based on the Frank-Wolfe method is adopted to solve this problem. For further reading and surveys on constructing decision trees, we refer the reader to Breslow and Aha (1997), Li et al. (2001), Murthy (1997, 1998), and Safavian and Landgrebe (1991).…”
Section: Introduction
confidence: 99%