Association rule mining is an important research topic in data mining. It consists of two steps: finding frequent itemsets and then extracting interesting rules from those itemsets. In the first step, efficiency matters because discovering frequent itemsets is computationally expensive. In the second step, unbiased assessment matters for good decision making. In this paper, we address both the efficiency of the mining algorithm and the measure of interest of the resulting rules. First, we present an algorithm for finding frequent itemsets that uses a vertical database. We also introduce a modified vertical data format to reduce the size of the database and an itemset reordering strategy to reduce the size of the intermediate tidsets. Second, we present a new measure to evaluate the interest of the resulting association rules. Our performance analysis shows that the proposed algorithm reduces the size of the intermediate tidsets generated during the mining process, and the smaller tidsets make intersection operations faster. Using the proposed interest measure helps to avoid the discovery of misleading rules.
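To make the vertical-database idea concrete, the following is a minimal sketch of generic vertical-format mining (in the style of Eclat), not the algorithm proposed in this paper. Each item is mapped to its tidset, the set of transaction ids containing it, and the support of a larger itemset is obtained by intersecting tidsets. The function names `to_vertical` and `mine_frequent`, the sample transactions, and the ascending-support ordering are all assumptions made for illustration; the paper's modified vertical format and its specific reordering strategy are not reproduced here.

```python
from collections import defaultdict


def to_vertical(transactions):
    """Map each item to the set of transaction ids (its tidset)."""
    vertical = defaultdict(set)
    for tid, items in enumerate(transactions):
        for item in items:
            vertical[item].add(tid)
    return dict(vertical)


def mine_frequent(transactions, min_support):
    """Return {frozenset(itemset): support} for all itemsets whose
    absolute support is at least min_support."""
    vertical = to_vertical(transactions)
    # Keep only frequent single items. Ordering by ascending support is an
    # assumed, commonly used heuristic for keeping intersections small; it
    # is not necessarily the reordering strategy used in the paper.
    items = sorted(
        ((item, tids) for item, tids in vertical.items()
         if len(tids) >= min_support),
        key=lambda pair: (len(pair[1]), pair[0]),
    )
    result = {}

    def extend(prefix, candidates):
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            result[itemset] = len(tids)
            # Intersect this tidset with each later candidate's tidset;
            # only frequent extensions are explored further.
            suffix = []
            for other, other_tids in candidates[i + 1:]:
                common = tids & other_tids
                if len(common) >= min_support:
                    suffix.append((other, common))
            extend(itemset, suffix)

    extend(frozenset(), items)
    return result


# Hypothetical sample data: five transactions over items a, b, c.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]
frequent = mine_frequent(transactions, min_support=3)
```

In this sample, each single item and each pair meets the support threshold of 3, while the triple {a, b, c} occurs in only two transactions and is pruned. Because every intersection result is no larger than its inputs, tidsets shrink as itemsets grow, which is why reducing intermediate tidset size directly speeds up the intersection step.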