“…A calendar schema (CS) is a relational schema that is based on the calendar concept hierarchy [9], [21]. A CS is defined to comprise a set of calendar-based time granularities and corresponding possible domain values.…”
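As a rough illustration of the definition above, a calendar schema can be modelled as a mapping from each time granularity to its allowed domain values. The granularity names, the year range, the `*` wildcard, and the helper functions `is_valid_pattern` and `matches` are all assumptions made for this sketch; the excerpt does not specify them.

```python
from datetime import date

# Hypothetical calendar schema: each granularity maps to its domain values.
CALENDAR_SCHEMA = {
    "year": range(2010, 2013),
    "month": range(1, 13),
    "day": range(1, 32),
}

WILDCARD = "*"  # assumption: '*' means "any value" of that granularity


def is_valid_pattern(pattern):
    """Return True if every field is a wildcard or lies in the schema domain."""
    if len(pattern) != len(CALENDAR_SCHEMA):
        return False
    return all(
        value == WILDCARD or value in domain
        for value, domain in zip(pattern, CALENDAR_SCHEMA.values())
    )


def matches(pattern, day):
    """Return True if the calendar date is covered by the (year, month, day) pattern."""
    fields = (day.year, day.month, day.day)
    return all(p == WILDCARD or p == f for p, f in zip(pattern, fields))
```

Under these assumptions, the pattern `("*", 12, "*")` covers every day in December of any year in the schema, so `matches(("*", 12, "*"), date(2011, 12, 25))` is true.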
“…The transaction DB used in the experiments is composed of 17 distinct items, and the average length of the transactions is 12. The execution times between the algorithm proposed in this paper and the traditional classification based on temporal class-association rules (CTAR) [21] algorithm are compared when executed on one machine (see Fig. 9).…”
“…The accuracy of the proposed classification model in predicting the customer characterization group was evaluated based on the precision, recall, and F1-value of (3) through (5). Thresholds such as the support, confidence, and frequency involved in the generation of classification rules were experimented on in the same way as our previous works [21]; thus, the detailed process of the test was omitted, and we used the optimal parameter setting values from the test results. All evaluation measures were obtained using the methodology of a stratified 10-fold cross validation for all six classes.…”
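Equations (3) through (5) are not reproduced in this excerpt. The sketch below uses the standard per-class definitions of precision, recall, and F1, which may differ in notation from the paper's. The function name `per_class_scores` and the toy labels are hypothetical.

```python
def per_class_scores(y_true, y_pred, label):
    """Return (precision, recall, f1) for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)

    # Guard against empty denominators (no predictions or no true instances).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Toy labels, purely illustrative (not data from the paper).
y_true = ["a", "a", "b", "b"]
y_pred = ["a", "b", "b", "b"]
scores_a = per_class_scores(y_true, y_pred, "a")
scores_b = per_class_scores(y_true, y_pred, "b")
```

In a stratified 10-fold setting, such scores would be computed on each held-out fold and then averaged per class.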
“…In this section, the methodology using the MapReduce framework is proposed for a market basket analysis using the TACRs algorithm. The TACRs algorithm was applied for intrusion detection in our previous work [21]. Therefore, the concept of TACRs is briefly recalled first, and a description of an extended algorithm using a distributed programming method based on MapReduce will follow.…”
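The excerpt does not give the TACRs algorithm itself, so the following is only a single-process imitation of the map, shuffle, and reduce steps, applied to plain itemset support counting in a market basket setting. It is not the authors' method and it does not use a real MapReduce runtime. The function names and the sample transactions are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def map_phase(transaction, size=2):
    """Emit (itemset, 1) for every itemset of the given size in one transaction."""
    for itemset in combinations(sorted(set(transaction)), size):
        yield itemset, 1


def shuffle(pairs):
    """Group emitted values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Sum the partial counts for one itemset."""
    return key, sum(values)


def frequent_itemsets(transactions, min_support, size=2):
    """Return itemsets whose absolute support count reaches min_support."""
    pairs = (kv for t in transactions for kv in map_phase(t, size))
    counts = dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
    return {k: c for k, c in counts.items() if c >= min_support}


# Illustrative transactions only (not data from the paper).
baskets = [["stamp", "parcel", "box"], ["stamp", "parcel"], ["parcel", "box"]]
result = frequent_itemsets(baskets, min_support=2)
```

Because each transaction is mapped independently and each key is reduced independently, both phases could in principle be spread across machines, which is the property a MapReduce-based extension relies on.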
A reliable analysis of consumer preference from a large amount of purchase data acquired in real time and an accurate customer characterization technique are essential for successful direct marketing campaigns. In this study, an optimal segmentation of post office customers in Korea is performed using a subspace projection–based clustering method to generate an accurate customer characterization from a high‐dimensional census dataset. Moreover, a traditional temporal mining method is extended to an algorithm using the MapReduce framework for a consumer preference analysis. The experimental results show that parallel mining is possible through a MapReduce‐based algorithm and that the execution time of the algorithm is shorter than that of a traditional method.
“…And also many candidate items are generated, so it takes too long time to perform the candidate sets. Therefore in this paper to reduce the time and space complexity, we applied FP-growth method with depth-first technique to find out the similar XML query patterns [1,3,7,9,10].…”
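FP-growth avoids candidate generation by first compressing the transactions into a prefix tree (an FP-tree) and then mining that tree depth-first. The sketch below covers only the tree-construction stage, under the assumption that each query pattern can be represented as a set of hashable items; the `Node` class, `build_fp_tree`, and the sample items are hypothetical and are not taken from the cited paper.

```python
from collections import Counter


class Node:
    """One FP-tree node: an item, its count on this path, and child nodes."""

    def __init__(self, item):
        self.item = item
        self.count = 0
        self.children = {}


def build_fp_tree(transactions, min_support):
    """Build an FP-tree from transactions, keeping only frequent items."""
    # First pass: count in how many transactions each item occurs.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {item for item, c in counts.items() if c >= min_support}

    root = Node(None)
    # Second pass: insert each transaction with items in descending frequency,
    # so that common prefixes share the same path.
    for t in transactions:
        ordered = sorted(
            (item for item in set(t) if item in frequent),
            key=lambda item: (-counts[item], item),
        )
        node = root
        for item in ordered:
            node = node.children.setdefault(item, Node(item))
            node.count += 1
    return root


# Illustrative items only; real inputs would be XML query subtrees.
tree = build_fp_tree([["a", "b", "c"], ["a", "b"], ["a", "d"]], min_support=2)
```

Shared prefixes are stored once with a count, which is why the tree needs only two passes over the data, whereas Apriori rescans the data for each candidate length.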
XML data are increasing in many areas, including the internet and public documentation, and they change dynamically while queries are processed. Many techniques have been studied to speed up queries over XML data structures. In this paper, we analyze query patterns based on the XML structure and propose a data mining technique for extracting similar query patterns issued by users. To improve performance, we use the FP-growth algorithm to mine similar query patterns over the XML data structure. We confirmed that the proposed method, which applies FP-growth to XML query subtrees, outperforms the Apriori algorithm. The proposed method returns results quickly for repeatedly occurring queries.
18th International Workshop on Database and Expert Systems Applications