2014
DOI: 10.4304/jsw.9.9.2361-2365

An Algorithm for Mining Frequent Itemsets from Library Big Data

Abstract: Frequent itemset mining plays an important part in college library data analysis. Because library databases contain a lot of redundant data, the mining process may generate intra-property frequent itemsets, which significantly hinders its efficiency. To address this issue, we propose an improved FP-Growth algorithm, which we call RFP-Growth, that avoids generating intra-property frequent itemsets. To further boost its efficiency, we implement a MapReduce version with an additional pruning strategy. The proposed algo…

Cited by 6 publications (3 citation statements)
References 15 publications
“…Following Wu et al (2004), we employed the traditional Apriori algorithm to extract the association rules from the questionnaire data. This algorithm was chosen due to its effectiveness for less frequent patterns and smaller sets of candidate association rules (Li, 2014). The Apriori algorithm, similar to other association rules mining approaches, aims to predict the occurrence of library service usage { Y } based on the given set of transactions { X }.…”
Section: Data and Results (mentioning)
confidence: 99%
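
For the rule X → Y mentioned in the excerpt above, the two standard measures used in association rule mining can be written as follows. This is the textbook definition, not something taken from the cited papers; the symbols D (the set of transactions) and T (a single transaction) are notation assumed here for illustration.

```latex
% Standard definitions; D and T are assumed notation, not from the source.
\[
\operatorname{supp}(X \Rightarrow Y)
  = \frac{\lvert \{\, T \in D : X \cup Y \subseteq T \,\} \rvert}{\lvert D \rvert},
\qquad
\operatorname{conf}(X \Rightarrow Y)
  = \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)} .
\]
```

Note that this rule confidence is unrelated to the "confidence: 99%" label attached to each citation statement on this page.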
“…However, as demonstrated by Han et al (2000), this algorithm is not effective in situations with frequent and long patterns due to a huge number of candidate sets. In addition, Li (2014) showed that when the generation of a huge set of candidates is avoided, the library usage mining performance can be substantially improved. In this study, FP-growth (frequent pattern) algorithm (Han et al , 2000; Borgelt, 2005; Li, 2014) was therefore employed to extract association rules from the usage of library services.…”
Section: Data Collection and Research Methodology (mentioning)
confidence: 99%
“…In addition, Li (2014) showed that when the generation of a huge set of candidates is avoided, the library usage mining performance can be substantially improved. In this study, FP-growth (frequent pattern) algorithm (Han et al , 2000; Borgelt, 2005; Li, 2014) was therefore employed to extract association rules from the usage of library services. The algorithm iteratively reduces the minimum support (how often an association rule is applicable to the given data set of library visits) until it finds the required number of association rules with the given minimum metric.…”
Section: Data Collection and Research Methodology (mentioning)
confidence: 99%
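
The last sentence of the excerpt above describes a loop that lowers the minimum support until enough rules are found. A minimal Python sketch of that idea follows. It is an illustration under stated assumptions, not the citing authors' implementation: it enumerates itemsets by brute force rather than with FP-growth, it assumes the "minimum metric" is rule confidence, and the function names, the step size `delta`, the lower bound `floor`, and the sample data are all hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration (NOT FP-growth): map each frequent itemset to its support."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found_any = False
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            support = sum(1 for t in transactions if itemset <= t) / n
            if support >= min_support:
                result[itemset] = support
                found_any = True
        if not found_any:
            break  # no frequent itemset of this size, so none of a larger size
    return result


def rules(transactions, min_support, min_confidence):
    """Return (lhs, rhs, support, confidence) for every rule meeting both thresholds."""
    freq = frequent_itemsets(transactions, min_support)
    out = []
    for itemset, supp in freq.items():
        for k in range(1, len(itemset)):
            for lhs_items in combinations(sorted(itemset), k):
                lhs = frozenset(lhs_items)
                conf = supp / freq[lhs]  # every subset of a frequent itemset is frequent
                if conf >= min_confidence:
                    out.append((lhs, itemset - lhs, supp, conf))
    return out


def mine_with_decreasing_support(transactions, n_rules, min_confidence,
                                 start=1.0, delta=0.1, floor=0.1):
    """Lower the minimum support step by step until n_rules rules are found.

    start, delta and floor are assumed parameters chosen for illustration.
    """
    step = 0
    while True:
        support = round(start - step * delta, 10)  # rounding avoids float drift
        found = rules(transactions, support, min_confidence)
        next_support = round(start - (step + 1) * delta, 10)
        if len(found) >= n_rules or next_support < floor:
            return support, found
        step += 1


# Hypothetical library-visit transactions, made up for this example.
visits = [
    {"loan", "wifi"},
    {"loan", "wifi", "print"},
    {"loan"},
    {"wifi", "print"},
    {"loan", "wifi"},
]
final_support, found_rules = mine_with_decreasing_support(visits, n_rules=2, min_confidence=0.7)
```

The loop stops either when the requested number of rules is reached or when the next step would fall below `floor`, so it always terminates even if the data cannot yield that many rules.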