Strong Optimal Classification Trees

Aghaei, Sina; Gómez, Andrés; Vayanos, Phebe

doi:10.48550/arxiv.2103.15965

Cited by 5 publications

(14 citation statements)

References 43 publications

“…Building upon SOCTs [26,27], the decision tree problem is formulated in a max-flow based model for computational efficiency.…”

Section: Improved Strong Optimal Classification Trees (ISOCTs) (mentioning)

confidence: 99%

“…The SOCT model [26,27], especially its branching constraints, is constructed for binary features. These constraints are as follows:…”

Section: Branching Threshold Constraints For Continuous Features (mentioning)

confidence: 99%

“…Constraints (24) indicate that data sample j can be counted as at most one in the data flow. Constraints (25) ensure that if data sample j arrives at a branch node n, then it must go to its left- or right-child node, or the sink node S. Constraints (26) ensure that if data sample j arrives at a terminal node n, then it must go to the sink node S. Constraints (27) require that only when data sample j is correctly classified to the category at node n under contingency c, then it can flow into the sink node S through node n. Constraints (28) ensure that there are at least N_min data samples reaching each leaf node n.…”

Section: Remaining Constraints (mentioning)

confidence: 99%
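
The flow interpretation in the statement above can be illustrated with a minimal Python sketch. It is not the MIP from either paper: it assumes a fixed tree of complete depth with binary features, ignores the contingency index c, and lets samples reach the sink only from leaf nodes. The names `route`, `sink_flow`, `branch_feature`, `leaf_label`, and `n_min` are hypothetical and chosen for illustration only.

```python
from collections import Counter


def route(x, branch_feature, depth):
    """Follow one sample from the root (node 1) to a leaf of a complete binary tree.

    Nodes use heap indexing: the children of node n are 2n (left) and 2n + 1 (right).
    A sample goes left when the tested binary feature is 0 and right when it is 1.
    """
    n = 1
    for _ in range(depth):
        n = 2 * n + (1 if x[branch_feature[n]] == 1 else 0)
    return n


def sink_flow(X, y, branch_feature, leaf_label, depth, n_min=0):
    """Return (flow, feasible) for a fixed tree.

    flow     : number of samples that reach the sink, i.e. land in a leaf
               whose predicted label equals their true label.
    feasible : True if every leaf receives at least n_min samples.
    """
    arrivals = Counter()
    flow = 0
    for x, label in zip(X, y):
        leaf = route(x, branch_feature, depth)
        arrivals[leaf] += 1
        if leaf_label[leaf] == label:  # only correctly classified samples flow on
            flow += 1
    feasible = all(arrivals[leaf] >= n_min for leaf in leaf_label)
    return flow, feasible


# Hypothetical depth-2 tree: the root tests feature 0, both children test feature 1.
branch_feature = {1: 0, 2: 1, 3: 1}
leaf_label = {4: 0, 5: 1, 6: 1, 7: 0}
X = [(0, 0), (0, 1), (1, 0), (1, 1), (1, 1)]
y = [0, 1, 1, 0, 1]

flow, feasible = sink_flow(X, y, branch_feature, leaf_label, depth=2, n_min=1)
print(flow, feasible)
```

In an optimization model the branching features and leaf labels would be decision variables and the solver would maximize the flow; here they are fixed so that the routing logic can be checked by hand.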

“…with a loose linear programming relaxation and thus time-consuming to prove the optimality). To avoid the use of big-M constraints for computational efficiency, strong optimal classification trees (SOCTs) were developed [26,27] to transform the OCT problem into a max-flow based model from a source node to a sink node. However, SOCT only considered binary input features, while features of the corrective SCED problem, such as net load values, are continuous.…”

Section: Introduction (mentioning)

confidence: 99%

Interpretable data‐driven contingency classification for real‐time corrective security‐constrained economic dispatch

Yu, Gao, et al. 2023. IET Renewable Power Gen

High penetrations of renewable energy are crucial for low‐carbon power systems. However, the higher volatility of renewable power generation pushes real‐time operations closer to equipment limits. It is thus important to utilize flexibilities in the system through corrective security‐constrained economic dispatch (SCED) that allows generators to take corrective adjustments after contingencies. The corrective SCED problem, containing a large number of contingencies, and corresponding post‐contingency decisions and constraints, is very large in scale and difficult to solve using purely model‐based methods within the strict time limits of real‐time markets. To accelerate the solution process, this paper develops a novel interpretable data‐driven contingency classification method. Historical data and their potentially useful patterns are utilized in interpretable data‐driven decision tree classifiers. To directly consider continuous features, such as net load values, and to consider imbalanced datasets without much additional complexity, Improved Strong Optimal Classification Trees (ISOCTs) are developed with new branching threshold constraints and category weights in the objective function. ISOCTs are then embedded into a hybrid model‐based and data‐driven framework to guarantee the accuracy of the real‐time active contingency set and the resulting security of dispatch decisions. Numerical testing results demonstrate the classification accuracy, computational efficiency, and interpretability of the proposed approach.

“…Recently, many works have directly optimized a performance metric (e.g., accuracy) with soft or hard sparsity constraints on the tree size. Such decision tree optimization problems can be formulated using mixed integer programming (MIP) (Bertsimas and Dunn 2017;Verwer and Zhang 2019;Vilas Boas et al 2021;Günlük et al 2021;Rudin and Ertekin 2018;Aghaei, Gómez, and Vayanos 2021). Other approaches use SAT solvers to find optimal decision trees (Narodytska et al 2018;Hu et al 2020), though these techniques require data to be perfectly separable, which is not typical for machine learning.…”

Section: Related Work (mentioning)

confidence: 99%

Fast Sparse Decision Tree Optimization via Reference Ensembles

McTavish, Zhong, Achermann, et al. 2022. AAAI

Sparse decision tree optimization has been one of the most fundamental problems in AI since its inception and is a challenge at the core of interpretable machine learning. Sparse decision tree optimization is computationally hard, and despite steady effort since the 1960's, breakthroughs have been made on the problem only within the past few years, primarily on the problem of finding optimal sparse decision trees. However, current state-of-the-art algorithms often require impractical amounts of computation time and memory to find optimal or near-optimal trees for some real-world datasets, particularly those having several continuous-valued features. Given that the search spaces of these decision tree optimization problems are massive, can we practically hope to find a sparse decision tree that competes in accuracy with a black box machine learning model? We address this problem via smart guessing strategies that can be applied to any optimal branch-and-bound-based decision tree algorithm. The guesses come from knowledge gleaned from black box models. We show that by using these guesses, we can reduce the run time by multiple orders of magnitude while providing bounds on how far the resulting trees can deviate from the black box's accuracy and expressive power. Our approach enables guesses about how to bin continuous features, the size of the tree, and lower bounds on the error for the optimal decision tree. Our experiments show that in many cases we can rapidly construct sparse decision trees that match the accuracy of black box models. To summarize: when you are having trouble optimizing, just guess.

Scalable Optimal Multiway-Split Decision Trees with Constraints

Subramanian, Sun. 2023. AAAI

There has been a surge of interest in learning optimal decision trees using mixed-integer programs (MIP) in recent years, as heuristic-based methods do not guarantee optimality and find it challenging to incorporate constraints that are critical for many practical applications. However, existing MIP methods that build on an arc-based formulation do not scale well as the number of binary variables is in the order of 2 to the power of the depth of the tree and the size of the dataset. Moreover, they can only handle sample-level constraints and linear metrics. In this paper, we propose a novel path-based MIP formulation where the number of decision variables is independent of dataset size. We present a scalable column generation framework to solve the MIP. Our framework produces a multiway-split tree which is more interpretable than the typical binary-split trees due to its shorter rules. Our framework is more general as it can handle nonlinear metrics such as F1 score, and incorporate a broader class of constraints. We demonstrate its efficacy with extensive experiments. We present results on datasets containing up to 1,008,372 samples while existing MIP-based decision tree models do not scale well on data beyond a few thousand points. We report superior or competitive results compared to the state-of-art MIP-based methods with up to a 24X reduction in runtime.
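
The scaling claim in this abstract can be made concrete with a rough back-of-the-envelope sketch. It assumes, purely for illustration, exactly one binary variable per (leaf, sample) pair, so the count is 2^depth times the number of samples; real arc-based formulations differ by constant factors. The function name `arc_based_binary_vars` is hypothetical.

```python
def arc_based_binary_vars(depth, n_samples):
    """Order-of-magnitude count of binary variables: 2**depth * n_samples.

    Assumption: one binary variable per (leaf, sample) pair; constant
    factors of any specific formulation are ignored.
    """
    return (2 ** depth) * n_samples


# Largest dataset size quoted in the abstract, with an assumed depth of 3.
print(arc_based_binary_vars(3, 1_008_372))  # 8066976
```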
