Proceedings of the 2017 SIAM International Conference on Data Mining 2017
DOI: 10.1137/1.9781611974973.8

ALPINE: Progressive Itemset Mining with Definite Guarantees

Abstract: With increasing demand for efficient data analysis, execution time of itemset mining becomes critical for many large-scale or time-sensitive applications. We propose a dynamic approach for itemset mining that allows us to achieve flexible trade-offs between efficiency and completeness. ALPINE is to our knowledge the first algorithm to progressively mine itemsets and closed itemsets "support-wise". It guarantees that all itemsets with support exceeding the current checkpoint's support have been found before it …
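The "support-wise" guarantee described in the abstract can be illustrated with a small sketch. This is not the ALPINE algorithm itself, whose details are not given here; it is a naive best-first enumeration that relies only on the fact that a superset never has higher support than its subsets. The names `progressive_itemsets` and `mine_until_checkpoint` are hypothetical and chosen for illustration.

```python
import heapq
from itertools import takewhile


def progressive_itemsets(transactions, min_support=1):
    """Yield (support, itemset) pairs in non-increasing support order.

    Illustrative only: a naive best-first search, NOT the ALPINE algorithm.
    Items are assumed to be mutually comparable (e.g. all strings).
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})

    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for t in transactions if itemset <= t)

    # Max-heap on support (negated, since heapq is a min-heap).
    # Each entry: (-support, itemset as sorted tuple, index of its last item).
    heap = []
    for idx, item in enumerate(items):
        s = support(frozenset([item]))
        if s >= min_support:
            heapq.heappush(heap, (-s, (item,), idx))

    while heap:
        neg_s, itemset, last = heapq.heappop(heap)
        yield -neg_s, itemset
        # Extend only with later items so each itemset is generated once.
        # Every extension has support <= the popped support, so the
        # output order stays non-increasing.
        for idx in range(last + 1, len(items)):
            ext = itemset + (items[idx],)
            s = support(frozenset(ext))
            if s >= min_support:
                heapq.heappush(heap, (-s, ext, idx))


def mine_until_checkpoint(transactions, checkpoint):
    """Return every itemset whose support strictly exceeds `checkpoint`.

    Because output is ordered by support, stopping at the first itemset
    with support <= checkpoint cannot miss any higher-support itemset.
    """
    stream = progressive_itemsets(transactions)
    return list(takewhile(lambda pair: pair[0] > checkpoint, stream))


if __name__ == "__main__":
    txs = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"a"}]
    for sup, itemset in progressive_itemsets(txs, min_support=2):
        print(sup, itemset)
```

Because the generator can be interrupted at any point, a caller can stop once a time budget runs out and still know that everything above the last emitted support level is complete. The sketch recounts support by scanning all transactions, which would be far too slow for real datasets.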

Cited by 7 publications (6 citation statements)
References 12 publications
“…Since survey [31] mentioned frequent itemset mining (FIM) as a tool to identify strong associations between allelic combinations associated with diseases, the proposed algorithm needs further comparison with other approaches from FIM like DeBi [32] and anytime discovery approaches like Alpine [33] tested on GEA datasets as well; though their use may get complicated if we need to keep information about object names for decision-makers. It also requires further time complexity improvements to increase the scalability and quality of the extensive bicluster finding process for massive datasets.…”
Section: Results (mentioning)
confidence: 99%
“…Here, we clarify our contribution through the comparison with the related works. There are many previous works (Xin et al. 2005; Cheng et al. 2006, 2008; Song et al. 2007; Boley et al. 2009, 2010; Liu et al. 2012; Quadrana et al. 2015; Hu and Imielinski 2017) on PC mining that deal with lossy condensed representations of the FIs. However, most of those previous works, except for Song et al. (2007), Cheng et al. (2008), and Quadrana et al. (2015), are oriented to a transactional database that tolerates multiple scanning and allows us to assume a stable distribution of occurrences.…”
Section: Related Work (mentioning)
confidence: 99%
“…the corresponding closed itemset. Nowadays, there exist very efficient algorithms for computing frequent closed itemsets (Hu and Imielinski, 2017;Uno et al, 2005). Even for a low frequency threshold, they are able to efficiently generate an exponential number of closed itemsets.…”
Section: Introduction (mentioning)
confidence: 99%