Model-based probabilistic frequent itemset mining

Bernecker, Thomas; Cheng, Reynold; Cheung, David W.; Kriegel, Hans‐Peter; Lee, Sau Dan; Renz, Matthias; Verhein, Florian; Wang, Liang; Zuefle, Andreas

doi:10.1007/s10115-012-0561-2

Cited by 17 publications

(8 citation statements)

References 44 publications

(90 reference statements)

“…From another view, studies involving web-based semantic itemset mining (Lazcorreta et al., 2008), utility-based itemset mining methods (Yao and Hamilton, 2006), pre-pruned tree-based itemset mining (Meng and Sha, 2017), and possible world semantic mining (Bernecker et al., 2013) focused on mining rules by setting semantics on items. These semantic-based algorithms set semantic definitions of associations to enhance the interpretation and application of mined items.…”

Section: Related Work (mentioning)

confidence: 99%

A regression-based algorithm for frequent itemsets mining

Jia

Wang

2019

DTA

Purpose: Frequent itemset mining (FIM) is a basic topic in data mining. Most FIM methods build itemset database containing all possible itemsets, and use predefined thresholds to determine whether an itemset is frequent. However, the algorithm has some deficiencies. It is more fit for discrete data rather than ordinal/continuous data, which may result in computational redundancy, and some of the results are difficult to be interpreted. The purpose of this paper is to shed light on this gap by proposing a new data mining method.

Design/methodology/approach: Regression pattern (RP) model will be introduced, in which the regression model and FIM method will be combined to solve the existing problems. Using a survey data of computer technology and software professional qualification examination, the multiple linear regression model is selected to mine associations between items.

Findings: Some interesting associations mined by the proposed algorithm and the results show that the proposed method can be applied in ordinal/continuous data mining area. The experiment of RP model shows that, compared to FIM, the computational redundancy decreased and the results contain more information.

Research limitations/implications: The proposed algorithm is designed for ordinal/continuous data and is expected to provide inspiration for data stream mining and unstructured data mining.

Practical implications: Compared to FIM, which mines associations between discrete items, RP model could mine associations between ordinal/continuous data sets. Importantly, RP model performs well in saving computational resource and mining meaningful associations.

Originality/value: The proposed algorithms provide a novelty view to define and mine association.

“…For Itemset mining in uncertain data [39,40], two main models are generally used, (1) the expected support model [41] (an itemset X is considered frequent if and only if its expected support is not less than a user-specified support threshold) and (2) the probabilistic frequentness model [42] (an itemset X is called frequent if the probability that X occurs in at least minSup transactions is above a given threshold). In the first approach, the basic idea consists in exploiting the statistical properties of those items with low existential probabilities with a framework that comprises three modules: the trimming module, pruning module and patch up module.…”

Section: Evidential Data Mining (mentioning)

confidence: 99%

Data mining for decision support with uncertainty on the airplane

Sene

Kamsu-Foguem

Rumeau

2018

Data & Knowledge Engineering
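
The two models named in the statement above can be contrasted on a toy example. The following is a minimal sketch, not the method of any of the cited papers: it assumes each transaction contains the itemset X independently with a known probability, and the function names `expected_support` and `frequentness_probability` are hypothetical. The frequentness probability is computed with a plain dynamic program over the support distribution.

```python
from typing import List, Sequence


def expected_support(probs: Sequence[float]) -> float:
    """Expected support of X: sum of per-transaction containment probabilities."""
    return sum(probs)


def frequentness_probability(probs: Sequence[float], min_sup: int) -> float:
    """P(support(X) >= min_sup), assuming transactions are independent.

    dist[k] holds P(support == k) over the transactions processed so far.
    """
    dist: List[float] = [1.0]
    for p in probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, mass in enumerate(dist):
            nxt[k] += mass * (1.0 - p)   # X absent from this transaction
            nxt[k + 1] += mass * p       # X present in this transaction
        dist = nxt
    return sum(dist[max(min_sup, 0):])


# Assumed toy data: probability that X occurs in each of four transactions.
probs = [0.9, 0.8, 0.5, 0.1]
exp_sup = expected_support(probs)
freq_prob = frequentness_probability(probs, min_sup=2)

# Expected support model: frequent iff exp_sup >= support threshold.
# Probabilistic frequentness model: frequent iff freq_prob > probability threshold.
print(exp_sup, freq_prob)
```

Under the expected support model, X would be compared against a support threshold using `exp_sup` alone; under the probabilistic frequentness model, `freq_prob` is compared against a separate probability threshold, so the two models can disagree on the same data.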

“…Considering higher levels allows us to mine rules which would not be learned otherwise, and to learn more concise and generalized rules. The authors propose methods which explore taxonomies to speed up the mining process. … Bernecker et al. [11] performed a thorough …”

Section: Related Work (mentioning)

confidence: 99%

Probabilistic Frequent Itemset Mining with Hierarchical Background Knowledge

Melo

Völker

2015

IJKE

Abstract: In recent years, there has been significant development in the field of Probabilistic Frequent Itemset Mining (PFIM). Despite the complexity of calculating the frequentness probability of an itemset, approximation techniques allow us to reduce the complexity of the problem with very low approximation error. In this paper we investigate how to incorporate hierarchical taxonomies into the attribute uncertainty model, which assumes independence between the existential probabilities of items in a transaction. We propose scalable methods which can reduce noise and ensure consistency of the transactions by approximating the dependencies between attributes implied by a background hierarchical taxonomy. We also perform experiments in order to evaluate the scalability, the accuracy of the approximation, and the denoising performance of the proposed methods.

Index Terms: Probabilistic frequent itemset mining, generalized rules, hierarchical background knowledge.
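
The independence assumption of the attribute uncertainty model mentioned in the abstract can be illustrated briefly. This is a hedged sketch under stated assumptions, not the authors' implementation: transactions are represented as hypothetical dictionaries mapping items to existential probabilities, the helper names are invented for illustration, and the normal approximation with continuity correction is a generic textbook technique standing in for the unspecified "approximation techniques".

```python
import math
from typing import Dict, Iterable, Sequence


def itemset_probability(transaction: Dict[str, float], itemset: Iterable[str]) -> float:
    """P(itemset is contained in the transaction) under item independence:
    the product of the items' existential probabilities (0 if an item is missing)."""
    prob = 1.0
    for item in itemset:
        prob *= transaction.get(item, 0.0)
    return prob


def normal_approx_frequentness(probs: Sequence[float], min_sup: int) -> float:
    """Approximate P(support >= min_sup) by a normal distribution with the
    same mean and variance as the exact support distribution."""
    mean = sum(probs)
    var = sum(p * (1.0 - p) for p in probs)
    if var == 0.0:
        # Support is deterministic, so no approximation is needed.
        return 1.0 if mean >= min_sup else 0.0
    z = (min_sup - 0.5 - mean) / math.sqrt(var)  # continuity correction
    return 0.5 * math.erfc(z / math.sqrt(2.0))   # equals 1 - Phi(z)


# Assumed toy uncertain database: item -> existential probability.
database = [
    {"a": 0.9, "b": 0.5},
    {"a": 0.4},
    {"a": 1.0, "b": 1.0},
]
per_transaction = [itemset_probability(t, ["a", "b"]) for t in database]
approx = normal_approx_frequentness(per_transaction, min_sup=2)
print(per_transaction, approx)
```

The approximation needs only one pass to accumulate a mean and a variance, which is why such techniques scale better than computing the exact support distribution; its accuracy on a database this small is not meaningful.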
