2020
DOI: 10.1016/j.ejor.2019.12.002

Sparsity in optimal randomized classification trees

Abstract: Decision trees are popular Classification and Regression tools and, when small-sized, easy to interpret. Traditionally, a greedy approach has been used to build the trees, yielding a very fast training process; however, controlling sparsity (a proxy for interpretability) is challenging. In recent studies, optimal decision trees, where all decisions are optimized simultaneously, have shown better learning performance, especially when oblique cuts are implemented. In this paper, we propose a continuous optimiza…
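The randomized-split idea the abstract describes can be illustrated with a minimal sketch. Everything named below (`soft_split`, `depth1_objective`, the logistic smoothing, the temperature, the fixed leaf labels, and the single l1 penalty) is an illustrative assumption, not the paper's exact formulation: each branch node routes an instance left with a probability that is smooth in the oblique cut a·x + b, so the whole tree objective becomes continuous and can be handed to a generic optimizer, with a sparsity penalty on the cut coefficients.

```python
import numpy as np
from scipy.optimize import minimize

def soft_split(X, a, b, temperature=1.0):
    # Probability of routing each row of X to the left child of a node
    # with oblique cut a.x + b; the logistic CDF makes routing smooth
    # in (a, b). The smoothing choice is an assumption for illustration.
    return 1.0 / (1.0 + np.exp(-(X @ a + b) / temperature))

def depth1_objective(params, X, y, lam=0.1):
    # Expected misclassification rate of a depth-1 randomized tree whose
    # left leaf predicts class 0 and right leaf class 1 (hypothetical
    # leaf assignment), plus an l1 sparsity penalty on the cut weights.
    d = X.shape[1]
    a, b = params[:d], params[d]
    p_left = soft_split(X, a, b)
    p_correct = np.where(y == 0, p_left, 1.0 - p_left)
    return (1.0 - p_correct).mean() + lam * np.abs(a).sum()

# Toy usage: a derivative-free optimizer copes with the nonsmooth l1 term.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)
res = minimize(depth1_objective, x0=np.zeros(X.shape[1] + 1),
               args=(X, y), method="Powell")
print(res.x[:5])  # most weight should land on the two informative features
```

Because the l1 term makes small coefficients collapse to zero, the fitted cut tends to use only the informative attributes, which is the sparsity-as-interpretability trade-off the abstract refers to.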

Cited by 40 publications (43 citation statements)
References 32 publications
“…The proposed model takes into account the trade-off between accuracy and the simplicity of the chosen rules and is solved via a column generation method. Blanquero et al. (2018a; 2018b) use a continuous optimization formulation to learn classification trees, where random decisions are made at internal nodes of the tree. Their approach is essentially a randomized optimal version of CART.…”
Section: Related Work
confidence: 99%
“…Hence, greedy-based heuristics such as CART (Breiman et al. 1984) and ID3 (Quinlan 1986) have been widely used to construct sub-optimal trees. Recent years have seen an increasing number of works that employ various Mathematical Optimization methods to build better-quality decision trees, e.g., (Bennett and Blue 1996; Bessiere, Hebrard, and O'Sullivan 2009; Bertsimas and Dunn 2017; Silva 2017; Dash, Günlük, and Wei 2018; Blanquero et al. 2018a; 2018b; Firat et al. 2018).…”
Section: Introduction
confidence: 99%
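For contrast with the optimized approaches discussed above, here is a minimal sketch of the myopic split search that CART-style greedy heuristics repeat at every node. The function names are illustrative; Gini impurity is the criterion CART uses, and the one-node scope is what makes the resulting tree sub-optimal.

```python
import numpy as np

def gini(y):
    # Gini impurity of a label vector (the CART splitting criterion).
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - (p ** 2).sum()

def best_axis_aligned_split(X, y):
    # Greedy one-node search: the single (feature, threshold) pair that
    # minimizes the weighted impurity of the two children, chosen
    # myopically -- earlier splits are never revisited, which is why the
    # overall tree can be far from optimal.
    best = (None, None, np.inf)
    n = len(y)
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:  # thresholds keep both sides nonempty
            mask = X[:, j] <= t
            score = (mask.sum() * gini(y[mask])
                     + (~mask).sum() * gini(y[~mask])) / n
            if score < best[2]:
                best = (j, t, score)
    return best
```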
“…1b. Many different approaches have been undertaken to implement more optimal DTs [9, 11–14, 16, 18, 19]. The computational complexity of training non-greedy DTs, however, grows exponentially with the number of nodes, as opposed to linearly for greedy ones.…”
Section: Methods
confidence: 99%
“…The concern that GDTs are suboptimal was addressed long ago [9]. The problem of constructing a globally optimal DT is NP-hard [10]; hence, various optimization techniques, relying on linear programming [9, 11, 12], stochastic gradient descent [13], mixed-integer formulations [14], anytime induction [15], randomization [16], multilayer cascade structures [17], column generation techniques [18], and genetic algorithms [19], have been proposed to solve this problem. All of these methods seek to strike a balance between accuracy, simplicity and efficiency.…”
Section: Introduction
confidence: 99%
“…Recently, integer linear programming has also been employed to determine globally optimal univariate and oblique decision trees of an a priori specified maximum size [17, 18]. Blanquero et al. [19] instead develop a continuous optimization formulation to determine optimal randomized oblique decision trees. Additionally, penalty terms are introduced in the objective function to limit the overall number of attributes involved and the number of attributes per split, improving interpretability.…”
Section: Related Work and Contribution
confidence: 99%
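The two penalty terms this last statement describes can be written schematically. The symbols and norm choices below are assumptions inferred from the statement's wording, not a verbatim reproduction of the paper's model: R̂(a, b) stands for the expected misclassification cost of the randomized tree, τ_B for the set of branch nodes, and a_jt for the coefficient of attribute j in the oblique cut at node t.

```latex
\min_{a,\,b}\;\;
\widehat{R}(a,b)
\;+\; \lambda_{\mathrm{local}} \sum_{t \in \tau_B} \sum_{j=1}^{p} \lvert a_{jt} \rvert
\;+\; \lambda_{\mathrm{global}} \sum_{j=1}^{p} \max_{t \in \tau_B} \lvert a_{jt} \rvert
```

The l1 term charges for every attribute used at each individual split (local, per-split sparsity), while the max term charges once for any attribute used anywhere in the tree, so driving max_t |a_jt| to zero removes attribute j from the whole tree (global sparsity).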