2011
DOI: 10.1007/978-3-642-23783-6_12

Efficient Mining of Top Correlated Patterns Based on Null-Invariant Measures

Abstract: Mining strong correlations from transactional databases often leads to more meaningful results than mining association rules. In such mining, null (transaction)-invariance is an important property of the correlation measures. Unfortunately, some useful null-invariant measures such as Kulczynski and Cosine, which can discover correlations even for the very unbalanced cases, lack the (anti)-monotonicity property. Thus, they could only be applied to frequent itemsets as the post-evaluation step. For lar…
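The two properties named in the abstract can be illustrated on a toy database. The sketch below is not the paper's algorithm; it assumes the commonly used generalized-mean definitions (Kulczynski as the arithmetic mean and Cosine as the geometric mean of sup(X)/sup(a_i) over the items a_i of an itemset X), and the function names and the toy transactions are hypothetical, chosen only for illustration. Lift is included as a contrast because it depends on the total number of transactions.

```python
from math import prod

def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)

def kulczynski(itemset, transactions):
    """Arithmetic mean of sup(X) / sup(a_i) over the items a_i of X (assumed definition)."""
    sup_all = support(itemset, transactions)
    return sum(sup_all / support([i], transactions) for i in itemset) / len(itemset)

def cosine(itemset, transactions):
    """Geometric mean of sup(X) / sup(a_i) over the items a_i of X (assumed definition)."""
    sup_all = support(itemset, transactions)
    singles = [support([i], transactions) for i in itemset]
    return sup_all / prod(singles) ** (1 / len(itemset))

def lift(itemset, transactions):
    """Lift uses relative supports, so it depends on the database size."""
    n = len(transactions)
    sup_all = support(itemset, transactions)
    return (sup_all / n) / prod(support([i], transactions) / n for i in itemset)

# Hypothetical toy database: a and b are common, c is rare and only
# ever appears together with both a and b.
db = [{"a", "b", "c"}] + [{"a"}] * 9 + [{"b"}] * 9

# The same database padded with 1000 null transactions (none of a, b, c).
padded = db + [set()] * 1000

ab, abc = ["a", "b"], ["a", "b", "c"]

# Null-invariance: padding leaves Kulczynski and Cosine unchanged, but not lift.
print(kulczynski(ab, db), kulczynski(ab, padded))
print(cosine(ab, db), cosine(ab, padded))
print(lift(ab, db), lift(ab, padded))

# No anti-monotonicity: the superset {a, b, c} scores higher than {a, b}.
print(kulczynski(ab, db), kulczynski(abc, db))
print(cosine(ab, db), cosine(abc, db))
```

On this toy data the superset {a, b, c} scores higher than {a, b} under both measures, so a search cannot prune supersets merely because a subset falls below the correlation threshold. That is consistent with the abstract's remark that such measures could previously only be applied as a post-evaluation step on frequent itemsets.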

Cited by 22 publications (18 citation statements)
References 18 publications
“…In addition, they do not distinguish the different requests issued from the different grid jobs and consider them as successive file accesses. Various studies have shown the limits of association rule mining based on the support and confidence approach [8,9,38,39]. They indeed confirm that it results on an excessively large number of rules, with a large part of them either redundant or not reflecting the true correlation relationship among data [13,43].…”
Section: Accepted Manuscript (mentioning)
confidence: 88%
“…Experiments realised on several datasets show the efficiency of GMJP according to both quantitative and qualitative aspects. An important direction for future work is to extend our approach to other correlation measures [10,18,20,22] through classifying them into classes of measures sharing the same properties.…”
Section: Discussion (mentioning)
confidence: 99%
“…In [26], the authors provide a unified definition of existing null-invariant correlation measures and propose the GAMINER approach allowing the extraction of frequent high correlated patterns according to the Cosine and to the Kulczynski measures. In this same context, the NICOMINER algorithm was also proposed in [10] and it allows the extraction of correlated patterns according to the Cosine measure. In this same context, we cite also the AETHERIS approach [21] which allow the extraction of condensed representation of correlated patterns according to user's preferences.…”
Section: Related Work (mentioning)
confidence: 98%
“…We admit that the failure of Apriori property on correlation bring challenges in computational issues, but on the other hand it makes the mining results more effective and meaningful. Therefore, we re-examine various interestingness measures [2,29,14], in search of a measure we could use to mine correlated sequential patterns, and finally obtain the motivation of designing our correlation measure from the null hypothesis in Ngram testing.…”
Section: Measure Selection (mentioning)
confidence: 99%
“…In Table 4, we list top-ranked sequential patterns (whose size is larger than two) according to four measures: support (see Definition 2.2), all-confidence [29], lift [14], and cor (Equation 1). We can see that patterns with the highest support values are mostly random combinations of popular words: even though some phrases make sense as high level concepts, e.g., 'database system', their useless duplicates may appear multiple times, such as, 'oriented database system', 'data base system', and 'object database system'.…”
Section: Study On DBLP (mentioning)
confidence: 99%