Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2007
DOI: 10.1145/1281192.1281232

Finding low-entropy sets and trees from binary data

Abstract: The discovery of subsets with special properties from binary data has been one of the key themes in pattern discovery. Pattern classes such as frequent itemsets stress the co-occurrence of the value 1 in the data. While this choice makes sense in the context of sparse binary data, it disregards potentially interesting subsets of attributes that have some other type of dependency structure. We consider the problem of finding all subsets of attributes that have low complexity. The complexity is measured by either…

Citations: cited by 32 publications (45 citation statements)
References: 27 publications
“…Continuing the above line of research, Heikinheimo et al define two related problems, namely, mining high- and low-entropy sets [5]. Zhang and Masseglia [6] extended their method to work on streaming data and proposed to reduce its output by removing similar sets according to criteria based on mutual information [20].…”
Section: Entropy-based Measures Of Itemset Interestingness
Citation type: mentioning
confidence: 99%
“…In other words, in each iteration the algorithm tries to reduce the total description length as much as possible. If a merge reduces the lowest description length seen yet, we remember it (6-7), and finally return the best clustering (10).…”
Section: Mining Attribute Clusterings
Citation type: mentioning
confidence: 99%
“…Most related to our method are low-entropy sets [10], itemsets for which the entropy of the data is below a given threshold. As entropy is strongly monotonically increasing, typically very many low-entropy sets are discovered even for low thresholds.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
“…Ref. [12] proposed to find those low-entropy sets, and introduced two low entropy trees. They discussed properties of their trees and proposed some mining algorithms.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
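
The citation statements above describe low-entropy sets as attribute sets whose entropy in the data falls below a given threshold. The following is a minimal Python sketch of that idea, not the paper's algorithm: it assumes the empirical joint entropy (in bits) as the complexity measure and enumerates candidate sets by brute force. The function names, the `max_size` cap, and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations
from math import log2


def joint_entropy(rows, attrs):
    """Empirical joint entropy (in bits) of the columns listed in attrs."""
    # Count how often each combination of values occurs on the chosen columns.
    counts = Counter(tuple(row[a] for a in attrs) for row in rows)
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def low_entropy_sets(rows, num_attrs, threshold, max_size):
    """Brute-force search for attribute sets with entropy <= threshold.

    Assumption: sets are capped at max_size attributes to keep the
    enumeration small; this is an illustrative choice.
    """
    found = []
    for k in range(1, max_size + 1):
        for attrs in combinations(range(num_attrs), k):
            h = joint_entropy(rows, attrs)
            if h <= threshold:
                found.append((attrs, h))
    return found


# Hypothetical toy data: column 1 copies column 0, column 2 is independent.
rows = [
    (0, 0, 0),
    (0, 0, 1),
    (1, 1, 0),
    (1, 1, 1),
]

for attrs, h in low_entropy_sets(rows, num_attrs=3, threshold=1.0, max_size=2):
    print(attrs, h)
```

On this toy data the pair (0, 1) qualifies because the two columns are perfectly dependent, even though neither value is rare, whereas pairs involving column 2 do not. Since adding an attribute never lowers the joint entropy, a set that exceeds the threshold cannot have a qualifying superset, which a level-wise search could exploit for pruning; the sketch omits that step for brevity.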