Data mining automates the extraction of predictive information from large amounts of data. Association rule mining is one of the important research areas in data mining. It identifies useful associations or relationships among large sets of data items. In this paper, we present the important concepts of association rule mining and review existing algorithms, their effectiveness, and their drawbacks. The references provided in this paper cover the main theoretical issues and guide the researcher toward interesting research directions that have yet to be explored.
Abstract: In recent years, data mining has become an essential technique for discovering useful knowledge from transactional datasets. Association analysis is one of the vital data mining techniques. It captures relationships among items in a transactional dataset, and these relationships are generally used to develop future business strategy. The main step of association analysis is finding the frequent patterns in a large transactional dataset. Many methods for finding frequent patterns are available in the literature. Most of them find all frequent itemsets for a specified minimum support threshold. In some instances, however, it is desirable to examine only whether a few targeted patterns (for example, a special offer on a group of items to promote retail sales) occur in a large transactional dataset, in order to develop future business strategy.
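The difference between mining all frequent itemsets and checking a few targeted patterns can be illustrated with a minimal Python sketch. It is not the method of the paper above; the function names `support` and `check_targets`, the sample transactions, and the threshold value are all hypothetical and chosen only for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        raise ValueError("transactions must not be empty")
    target = frozenset(itemset)
    hits = sum(1 for t in transactions if target <= t)
    return hits / len(transactions)


def check_targets(targets, transactions, min_support):
    """Map each targeted itemset to True if it meets `min_support`.

    Only the listed targets are counted; no other itemsets are enumerated.
    """
    return {
        frozenset(t): support(t, transactions) >= min_support
        for t in targets
    }


# Hypothetical transactional dataset: one set of items per transaction.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter", "milk"}),
    frozenset({"milk", "eggs"}),
    frozenset({"bread", "butter"}),
]

# Hypothetical targeted patterns, e.g. items bundled in a special offer.
targets = [{"bread", "milk"}, {"butter", "eggs"}]
result = check_targets(targets, transactions, min_support=0.5)
```

Because only the targeted itemsets are counted, the cost is one pass over the data per target, instead of a search over every candidate itemset that might meet the threshold.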
Data is being produced in new forms and in unprecedented quantities. Research and other scientific and commercial applications are drawing the attention of the scientific community because of the size of their data and their need for faster access. The conventional access methods previously available in multidimensional databases are no longer suitable for these new forms of data. In traditional databases, a multicolumn index is created using a B-tree [5]. Such an index cannot skip over columns, so the primary index column must appear in the WHERE clause filters of the query. The R-tree [3], an extension of the B-tree, is a hierarchical, height-balanced multidimensional indexing structure that guarantees space utilization above a certain threshold. In most cases, however, the data produced is not spatial in nature, so it must be restructured in order to map the non-spatial data to geometric space. The major contribution of this paper is therefore the application of the multidimensional access offered by spatial access methods to non-spatial data, tried here for the first time, together with an analysis that has produced interesting results. The sequence of procedures followed to arrive at the analytical study is as follows:
1. Packing of non-spatial data, which converts the data into a form that allows multidimensional access, similar to using spatial access methods for spatial data.
2. A proposal to reduce the overlap of data by using Hilbert curves to order the data before insertion into the proposed indexing structure.
3. The proposal of a new index structure, the Hilbert ZR+ Tree [HZR+ Tree].
4. A collection of experiments and analysis that validates and demonstrates the efficiency of the proposed data model.
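The Hilbert-curve ordering step can be sketched in Python as follows. This is a generic illustration of ordering points along a 2-D Hilbert curve using the well-known iterative coordinate-to-index conversion, not the paper's HZR+ Tree or its packing scheme. It assumes that each record has already been mapped to integer coordinates on an n × n grid with n a power of two; the name `hilbert_index` and the sample records are hypothetical.

```python
def hilbert_index(n, x, y):
    """Position of grid cell (x, y) along a Hilbert curve on an n x n grid.

    Assumes n is a power of two and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the next level is in canonical position.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Hypothetical records already mapped to a 4 x 4 grid: (label, x, y).
records = [("a", 3, 0), ("b", 0, 0), ("c", 1, 1), ("d", 0, 2)]

# Order the records along the curve before inserting them into an index.
ordered = sorted(records, key=lambda r: hilbert_index(4, r[1], r[2]))
```

Cells that are consecutive on the curve are always adjacent on the grid, which is why sorting by this index before insertion tends to group nearby records together in the index.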