Proceedings of the Sixteenth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems 1997
DOI: 10.1145/263661.263684

Data mining, hypergraph transversals, and machine learning (extended abstract)

Abstract: Several data mining problems can be formulated as problems of finding maximally specific sentences that are interesting in a database. We first show that this problem has a close relationship with the hypergraph transversal problem. We then analyze two algorithms that have been previously used in data mining, proving upper bounds on their complexity. The first algorithm is useful when the maximally specific interesting sentences are "small". We show that this algorithm can also be used to efficiently solve a s…

Cited by 157 publications (113 citation statements)
References 20 publications
“…When V is a finite set of points and each object in F is an arbitrary finite subset of V, we obtain the well-known hypergraph transversal or dualization problem [2], which calls for finding all minimal hitting sets for a given hypergraph G ⊆ 2^V, defined on a finite set of vertices V. Denote by Tr(G) the set of all minimal hitting sets of G, also known as the transversal hypergraph of G. The problem of finding Tr(G) has received considerable attention in the literature (see, e.g., [3,12,13,19,29,31]), since it is known to be polynomially or quasi-polynomially equivalent with many problems in various areas, such as artificial intelligence (e.g., [12,24]), database theory (e.g., [30]), distributed systems (e.g., [23]), machine learning and data mining (e.g., [1,7,20]), mathematical programming (e.g., [5,25]), matroid theory (e.g., [26]), and reliability theory (e.g., [9]). …”
Section: Introduction (mentioning)
confidence: 99%
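
To make the definition of Tr(G) in the statement above concrete, here is a minimal brute-force sketch in Python. It is not the algorithm analyzed in the paper or in the citing work; the function name `minimal_transversals` and the example hypergraph are assumptions chosen for illustration, and the enumeration is exponential in |V|, so it only suits tiny inputs.

```python
from itertools import combinations


def minimal_transversals(vertices, edges):
    """Return Tr(G): all inclusion-minimal hitting sets of a hypergraph.

    Hypothetical helper for illustration only (brute force, exponential time).
    `vertices` is an iterable of sortable items, `edges` an iterable of sets.
    """
    edges = [frozenset(e) for e in edges]
    found = []
    # Visit candidates by increasing size, so every smaller hitting set
    # is already in `found` when a larger candidate is examined.
    for size in range(len(vertices) + 1):
        for cand in combinations(sorted(vertices), size):
            s = frozenset(cand)
            hits_every_edge = all(s & e for e in edges)
            contains_smaller = any(t <= s for t in found)
            if hits_every_edge and not contains_smaller:
                found.append(s)
    return found


# Example hypergraph (an assumption): V = {1, 2, 3}, edges {1, 2} and {2, 3}.
result = minimal_transversals({1, 2, 3}, [{1, 2}, {2, 3}])
print(result)
```

In this example the vertex 2 alone hits both edges, and {1, 3} is the only other hitting set that contains no smaller one.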
“…As a first approach, maximal frequent patterns [9] and closed frequent patterns [18] have been introduced. These subsets of frequent patterns are more concise and allow to derive all the frequent patterns.…”
Section: Pattern Summarization (mentioning)
confidence: 99%
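
The difference between maximal and closed frequent patterns mentioned in the statement above can be sketched on a toy dataset. The Python below is a brute-force illustration under stated assumptions: the helper names (`frequent_itemsets`, `maximal`, `closed`), the transactions, and the support threshold are all hypothetical and are not taken from the cited works.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Map each frequent itemset to its support count (brute force)."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for cand in combinations(items, size):
            s = frozenset(cand)
            support = sum(1 for t in transactions if s <= t)
            if support >= min_support:
                result[s] = support
    return result


def maximal(freq):
    """Frequent itemsets with no frequent proper superset."""
    return {s for s in freq if not any(s < t for t in freq)}


def closed(freq):
    """Frequent itemsets with no proper superset of equal support."""
    return {s for s in freq
            if not any(s < t and freq[t] == freq[s] for t in freq)}


# Toy data (an assumption): four transactions, absolute support threshold 2.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b"}]
freq = frequent_itemsets(transactions, min_support=2)
max_sets = maximal(freq)
closed_sets = closed(freq)

# Every frequent itemset is a subset of some maximal frequent itemset.
assert all(any(s <= m for m in max_sets) for s in freq)
```

On this data the frequent itemsets are {a}, {b}, {c}, {a, b}, and {a, c}; the maximal ones are {a, b} and {a, c}, while the closed ones additionally include {a} and {b}, whose supports differ from those of their supersets.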
“…However, global approaches more genuinely belong to Machine Learning, where the assumption of a global generative process behind the data is better supported. Most prominent example of work in the local tradition is the research on association rule mining [13,4,1,15]. We shall be concerned in this paper with the local problem.…”
Section: Local and Global Issues In Data Mining (mentioning)
confidence: 99%
“…A direct application of computational models developed for Machine Learning is therefore not possible, for at least two reasons. First, a Data Mining task may consist of finding many weak predictors in a hypothesis space whereas in Machine Learning one strong predictor is normally sought [4]. This is the so-called local-global problem [13].…”
Section: Introduction (mentioning)
confidence: 99%