Parameterized Complexity of Small Decision Tree Learning

Ordyniak, Sebastian; Szeider, Stefan

doi:10.1609/aaai.v35i7.16800

Cited by 4 publications

(9 citation statements)

References 20 publications


“…Thus, the total running time adds up to $O(D_{\max}^{2d} nd) = O(n^{2d+1} d)$. Before we elaborate on our next result, we first show how to improve Theorem 4 in (Ordyniak and Szeider 2021), which shows that, given an instance $(E, \lambda, s)$ of DTS, and given a subset $D$ of the dimensions, it is possible to compute in $2^{O(s^2)} |E|^{1+o(1)} \log |E|$ time the smallest decision tree among all decision trees of size at most $s$ (if they exists), that use exactly the given subset $D$ in their cuts - none may be left out. The main idea is to first enumerate the structure of all possible decision trees of size $s$, before finding thresholds that work for the instance.…”

Section: Algorithms (mentioning)

confidence: 99%
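One possible reading of the final equality in the statement above, under the assumption (not stated in the excerpt) that $D_{\max}$ denotes the largest number of distinct values the $n$ examples take in any single dimension, so that $D_{\max} \le n$:

```latex
\[
  O\!\left(D_{\max}^{2d} \cdot n d\right)
  \;\subseteq\; O\!\left(n^{2d} \cdot n d\right)
  \;=\; O\!\left(n^{2d+1} d\right)
\]
```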

“…The strategy employed by Ordyniak and Szeider to solve DTS is to first find a subset D of dimensions which should be cut to find a smallest decision tree (Ordyniak and Szeider 2021). Once the set D has been determined, they use their Theorem 4 to find a smallest decision tree.…”

Section: Algorithms (mentioning)

confidence: 99%

“…The classical CART heuristic herein is among the Top 10 Algorithms of Data Mining chosen by the ICDM (Wu et al 2008; Steinberg 2009) and several implementations are based on exact algorithms minimizing the size of the produced trees. Despite this, our knowledge of the computational complexity of learning (minimum-node) decision trees is limited: Several classical results show NP-hardness (Hyafil and Rivest 1976; Goodrich et al 1995) (see also the survey by Murthy (1998)) and we know that even if we require parameters such as the number of nodes of the tree, or the number of different feature values, to be small, we still cannot achieve efficient algorithms in terms of upper bounds on the running time (Ordyniak and Szeider 2021).…”

Section: Introduction (mentioning)

confidence: 99%

“…In contrast, we show that in the small-dimension regime we do obtain a prospect for an efficient algorithm with running time $O((s^3 d)^s \cdot n^{1+o(1)})$ (Theorem 3.2). An intermediate result towards this is inspired by and improves upon an algorithm by Ordyniak and Szeider (2021) for determining a smallest decision tree that cuts only a given set of features; we decrease the running time from 1) for our purpose.…”

Section: Introduction (mentioning)

confidence: 99%


The Influence of Dimensions on the Complexity of Computing Decision Trees

Kobourov, Löffler, Fabrizio, et al. 2023. AAAI

A decision tree recursively splits a feature space $\mathbb{R}^d$ and then assigns class labels based on the resulting partition. Decision trees have been part of the basic machine-learning toolkit for decades. A large body of work considers heuristic algorithms that compute a decision tree from training data, usually aiming to minimize in particular the size of the resulting tree. In contrast, little is known about the complexity of the underlying computational problem of computing a minimum-size tree for the given training data. We study this problem with respect to the number $d$ of dimensions of the feature space $\mathbb{R}^d$, which contains $n$ training examples. We show that it can be solved in $O(n^{2d+1})$ time, but under reasonable complexity-theoretic assumptions it is not possible to achieve $f(d) \cdot n^{o(d / \log d)}$ running time. The problem is solvable in $(dR)^{O(dR)} \cdot n^{1+o(1)}$ time, if there are exactly two classes and $R$ is an upper bound on the number of tree leaves labeled with the first class.
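To make the object in this abstract concrete, here is a minimal Python sketch of a decision tree over a d-dimensional feature space. It assumes axis-parallel threshold cuts of the form x[dim] <= threshold and counts size as the number of inner (cut) nodes; both are illustrative assumptions, and the names `Leaf`, `Cut`, `classify`, `size`, and `is_consistent` are hypothetical, not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple, Union


@dataclass(frozen=True)
class Leaf:
    label: int  # class label assigned to every point reaching this leaf


@dataclass(frozen=True)
class Cut:
    dim: int          # index of the dimension being cut (assumed axis-parallel)
    threshold: float  # points with x[dim] <= threshold go left, others go right
    left: "Node"
    right: "Node"


Node = Union[Leaf, Cut]


def classify(tree: Node, x: Sequence[float]) -> int:
    """Follow the cuts from the root until a leaf is reached."""
    while isinstance(tree, Cut):
        tree = tree.left if x[tree.dim] <= tree.threshold else tree.right
    return tree.label


def size(tree: Node) -> int:
    """Number of inner (cut) nodes; this size convention is an assumption."""
    if isinstance(tree, Leaf):
        return 0
    return 1 + size(tree.left) + size(tree.right)


def is_consistent(tree: Node, examples: List[Tuple[Sequence[float], int]]) -> bool:
    """True if the tree assigns every training example its given label."""
    return all(classify(tree, x) == y for x, y in examples)


# XOR-like training data in two dimensions (made up for illustration).
examples = [((0.0, 0.0), 0), ((0.0, 1.0), 1), ((1.0, 0.0), 1), ((1.0, 1.0), 0)]

one_cut = Cut(0, 0.5, Leaf(0), Leaf(1))
three_cuts = Cut(
    0, 0.5,
    Cut(1, 0.5, Leaf(0), Leaf(1)),
    Cut(1, 0.5, Leaf(1), Leaf(0)),
)
```

In this toy data set the single-cut tree misclassifies some points, while the three-cut tree is consistent with all four examples. The minimum-size problem asks for a consistent tree with as few cuts as possible.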


“…This complexity barrier motivates the study of the problem under the parameterized complexity paradigm. Ordyniak and Szeider (2021) made the first approach in this direction, parameterizing the problem by solution size in terms of the number of nodes or the depth of the computed DT. In this paper, we parameterize the problem to exploit the hidden structure of the given CI $E$. We capture the hidden structure of $E$ in terms of small rank-width of the incidence graph, which is the bipartite graph $G(E)$ whose vertices are the examples in one part and the features in the other, where an example $e$ is adjacent to a feature $f$ if and only if $e(f) = 1$.…”

Section: Introduction (mentioning)

confidence: 99%
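The incidence graph described in the statement above can be sketched in a few lines of Python. The representation is an assumption made for illustration: each example is a dictionary mapping feature names to 0 or 1, and the graph is returned as a set of (example, feature) edges. The function name `incidence_graph` and the sample data are hypothetical.

```python
from typing import Dict, Set, Tuple

# An example assigns 0 or 1 to each (Boolean) feature.
Example = Dict[str, int]


def incidence_graph(examples: Dict[str, Example]) -> Set[Tuple[str, str]]:
    """Return the edge set of the bipartite incidence graph.

    Examples form one side of the graph and features the other;
    example e is adjacent to feature f exactly when e(f) = 1.
    """
    edges: Set[Tuple[str, str]] = set()
    for name, values in examples.items():
        for feature, value in values.items():
            if value == 1:
                edges.add((name, feature))
    return edges


# Made-up classification instance with three examples and two features.
examples = {
    "e1": {"f1": 1, "f2": 0},
    "e2": {"f1": 1, "f2": 1},
    "e3": {"f1": 0, "f2": 0},
}
edges = incidence_graph(examples)
```

Here `e3` has no incident edges because it sets every feature to 0. Computing the rank-width of the resulting graph is a separate step and is not attempted in this sketch.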

Learning Small Decision Trees for Data of Low Rank-Width

Dabrowski, Eiben, Ordyniak, et al. 2024. AAAI

We consider the NP-hard problem of finding a smallest decision tree representing a classification instance in terms of a partially defined Boolean function. Small decision trees are desirable to provide an interpretable model for the given data. We show that the problem is fixed-parameter tractable when parameterized by the rank-width of the incidence graph of the given classification instance. Our algorithm proceeds by dynamic programming using an NLC decomposition obtained from a rank-width decomposition. The key to the algorithm is a succinct representation of partial solutions. This allows us to limit the space and time requirements for each dynamic programming step in terms of the parameter.


Logic-Based Explainability in Machine Learning

Marques-Silva 2023. Lecture Notes in Computer Science

In recent years, the impact of machine learning (ML) and artificial intelligence (AI) in society has been absolutely remarkable. This impact is expected to continue in the foreseeable future. However, the adoption of AI/ML is also a cause of grave concern. The operation of the most advanced AI/ML models is often beyond the grasp of human decision makers. As a result, decisions that impact humans may not be understood and may lack rigorous validation. Explainable AI (XAI) is concerned with providing human decision-makers with understandable explanations for the predictions made by ML models. As a result, XAI is a cornerstone of trustworthy AI. Despite its strategic importance, most work on XAI lacks rigor, and so its use in high-risk or safety-critical domains serves to foster distrust instead of contributing to build much-needed trust. Logic-based XAI has recently emerged as a rigorous alternative to those other non-rigorous methods of XAI. This paper provides a technical survey of logic-based XAI, its origins, the current topics of research, and emerging future topics of research. The paper also highlights the many myths that pervade non-rigorous approaches for XAI.

