Proceedings of the 2006 SIAM International Conference on Data Mining 2006
DOI: 10.1137/1.9781611972764.35

Item Sets That Compress

Abstract: One of the major problems in frequent item set mining is the explosion of the number of results: it is difficult to find the most interesting frequent item sets. The cause of this explosion is that large sets of frequent item sets describe essentially the same set of transactions. In this paper we approach this problem using the MDL principle: the best set of frequent item sets is that set that compresses the database best. We introduce four heuristic algorithms for this task, and the experiments show that the…
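To make the MDL criterion in the abstract concrete, the following is a minimal sketch of how a "total compressed size" could be computed for a candidate set of item sets. It is not the paper's implementation. The function names (`cover`, `total_compressed_size`), the longest-first cover order, and the exact cost charged for the code table are assumptions chosen for illustration; only the general idea (a smaller encoded size means a better set of item sets) comes from the abstract.

```python
from math import log2

def cover(transaction, table):
    """Greedily cover a transaction with disjoint item sets, in table order."""
    remaining = set(transaction)
    used = []
    for itemset in table:
        if itemset <= remaining:
            used.append(itemset)
            remaining -= itemset
    return used

def total_compressed_size(database, itemsets):
    """Bits for the encoded database plus an assumed cost for the code table."""
    # Item frequencies give a baseline ("standard") code length per item.
    item_count = {}
    for t in database:
        for i in t:
            item_count[i] = item_count.get(i, 0) + 1
    n_items = sum(item_count.values())
    std_len = {i: -log2(c / n_items) for i, c in item_count.items()}

    # Assumed table layout: candidates plus all singletons, longest first,
    # so that every transaction can always be covered.
    singletons = [frozenset([i]) for i in sorted(item_count)]
    table = list(itemsets) + [s for s in singletons if s not in itemsets]
    table.sort(key=len, reverse=True)

    # Count how often each item set is used when covering the database.
    usage = {s: 0 for s in table}
    for t in database:
        for s in cover(t, table):
            usage[s] += 1
    total_usage = sum(usage.values())

    # Shannon-optimal code length for every item set that is actually used.
    code_len = {s: -log2(u / total_usage) for s, u in usage.items() if u > 0}

    db_bits = sum(usage[s] * code_len[s] for s in code_len)
    # Assumed table cost: spell out each used item set with standard codes,
    # then store its own code.
    table_bits = sum(sum(std_len[i] for i in s) + code_len[s] for s in code_len)
    return db_bits + table_bits

database = [frozenset("abc"), frozenset("abc"), frozenset("ab"), frozenset("c")]
baseline = total_compressed_size(database, [])
with_ab = total_compressed_size(database, [frozenset("ab")])
print(baseline, with_ab)
```

On this toy database, adding the item set {a, b} lowers the total size relative to the singleton-only table, which is the sense in which an item set "compresses" the data.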

Cited by 125 publications
(163 citation statements)
References 13 publications
“…In [15] we defined the optimal set of (frequent) item sets as that one whose associated code table minimises the total compressed size:…”
Section: J∈ct F Req(j)
mentioning
confidence: 99%
“…Rather, in this paper we study the problem for one specific class of models, viz., the code tables induced by our Krimp algorithm [15]. Given all frequent item sets on a table, Krimp selects a small subset of these frequent item sets.…”
Section: Introduction
mentioning
confidence: 99%
“…In Siebes et al (2006) we introduced the krimp algorithm. 1 This MDL-based algorithm picks a few descriptive frequent item sets that compress the data well.…”
mentioning
confidence: 99%
“…Keogh et al [17] developed a simple and effective scheme for mining time-series data through compression. Actually, compression or Minimum Description Language (MDL) have become the workhorse of many parameter-free algorithms: frequent itemsets [24], biclustering [4,23], time-evolving graph clustering [25], and spatial-clustering [20].…”
Section: Related Work
mentioning
confidence: 99%