2021
DOI: 10.1609/aaai.v35i7.16800

Parameterized Complexity of Small Decision Tree Learning

Abstract: We study the NP-hard problem of learning a decision tree (DT) of smallest depth or size from data. We provide the first parameterized complexity analysis of the problem and draw a detailed parameterized complexity map for the natural parameters: size or depth of the DT, maximum domain size of all features, and the maximum Hamming distance between any two examples. Our main result shows that learning DTs of smallest depth or size is fixed-parameter tractable (FPT) parameterized by the combination of all three …

Cited by 4 publications
(9 citation statements)
References 20 publications
“…Thus, the total running time adds up to $O(D_{\max}^{2d} n d) = O(n^{2d+1} d)$. Before we elaborate on our next result, we first show how to improve Theorem 4 in (Ordyniak and Szeider 2021), which shows that, given an instance (E, λ, s) of DTS, and given a subset D of the dimensions, it is possible to compute in $2^{O(s^2)} |E|^{1+o(1)} \log |E|$ time the smallest decision tree among all decision trees of size at most s (if one exists) that use exactly the given subset D in their cuts: none may be left out. The main idea is to first enumerate the structure of all possible decision trees of size s, before finding thresholds that work for the instance.…”
Section: Algorithms (mentioning)
confidence: 99%
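
The enumeration step mentioned in the statement above can be illustrated with a minimal sketch. It is not the algorithm of either paper: it only lists the possible shapes of binary decision trees, assuming that "size" counts inner (test) nodes, and it leaves out the later step of assigning dimensions and thresholds. The name `tree_shapes` and the tuple encoding are hypothetical choices made for this illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def tree_shapes(s):
    """Return every binary tree shape with exactly s inner (test) nodes.

    Encoding (an assumption of this sketch): a leaf is None and an
    inner node is a pair (left_subtree, right_subtree).
    """
    if s == 0:
        return (None,)
    shapes = []
    # The root uses one inner node; split the remaining s - 1 between
    # the left and right subtrees in every possible way.
    for left_size in range(s):
        for left in tree_shapes(left_size):
            for right in tree_shapes(s - 1 - left_size):
                shapes.append((left, right))
    return tuple(shapes)


if __name__ == "__main__":
    for s in range(5):
        print(s, len(tree_shapes(s)))
```

The number of shapes grows as the Catalan numbers, so it depends only on s and not on the number of examples, which is the property that makes enumerating structures first a reasonable strategy when s is the parameter.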
“…The strategy employed by Ordyniak and Szeider to solve DTS is to first find a subset D of dimensions which should be cut to find a smallest decision tree (Ordyniak and Szeider 2021). Once the set D has been determined, they use their Theorem 4 to find a smallest decision tree.…”
Section: Algorithms (mentioning)
confidence: 99%
“…This complexity barrier motivates the study of the problem under the parameterized complexity paradigm. Ordyniak and Szeider (2021) made the first approach in this direction, parameterizing the problem by solution size in terms of the number of nodes or the depth of the computed DT. In this paper, we parameterize the problem to exploit the hidden structure of the given CI E. We capture the hidden structure of E in terms of small rank-width of the incidence graph, which is the bipartite graph G(E) whose vertices are the examples in one part and the features in the other, where an example e is adjacent to a feature f if and only if e(f) = 1.…”
Section: Introduction (mentioning)
confidence: 99%
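
The incidence graph G(E) defined in the statement above can be sketched as follows. This is only an illustration of the definition, assuming binary features and examples stored as dictionaries; it does not compute rank-width, and the name `incidence_graph` and the sample data are hypothetical.

```python
def incidence_graph(examples, features):
    """Build the edge set of the bipartite incidence graph G(E).

    examples: dict mapping an example name to a dict {feature: 0 or 1}
              (a representation assumed for this sketch).
    features: iterable of feature names.
    An example e is adjacent to a feature f if and only if e(f) = 1.
    """
    feature_list = list(features)
    return {
        (example, feature)
        for example, values in examples.items()
        for feature in feature_list
        if values.get(feature, 0) == 1
    }


if __name__ == "__main__":
    # Made-up toy instance with two examples and two binary features.
    toy_examples = {
        "e1": {"f1": 1, "f2": 0},
        "e2": {"f1": 1, "f2": 1},
    }
    print(sorted(incidence_graph(toy_examples, ["f1", "f2"])))
```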