Mining frequent weighted patterns (FWPs), which accounts for the differing semantic significance (weight) of items, is better suited to practical use than mining unweighted frequent patterns, and it therefore plays a vital role in real-world scenarios. However, methods for mining FWPs that were designed for static data have several limitations when applied to growing datasets, especially data streams. Hence, this study proposes an algorithm for mining FWPs over data streams. First, we introduce the concept of mining FWPs over data streams via a sliding window model. Then, we introduce a modification of the weighted node tree (WN-tree), named SWN-tree, that can maintain information over data streams. Next, we develop a method, based on the SWN-tree, for mining FWPs over data streams using a sliding window model; this method is called the FWPODS (Frequent Weighted Patterns Over Data Stream) algorithm. Finally, we conduct empirical experiments to compare the performance of our approach with that of the state-of-the-art algorithm (NFWI) for mining FWPs over data streams. The experimental results indicate that our approach outperforms the NFWI algorithm when running in batch mode in a sliding window.
INDEX TERMS: pattern mining, data streams, frequent weighted patterns, sliding window model.
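To make the sliding-window setting concrete, the following is a minimal Python sketch of computing weighted support over the most recent transactions of a stream. It is not the FWPODS algorithm and does not use an SWN-tree; it enumerates candidates by brute force. It also assumes a definition the abstract does not state: the weight of a transaction is taken as the mean weight of its items, and the weighted support of a pattern is the total weight of the window transactions containing it divided by the total weight of all window transactions. All class, function, and parameter names are hypothetical.

```python
from collections import deque
from itertools import combinations


def transaction_weight(items, weights):
    """Assumed definition: mean of the weights of the items in the transaction."""
    return sum(weights[i] for i in items) / len(items)


class SlidingWindow:
    """Keeps only the `size` most recent transactions of a stream."""

    def __init__(self, size):
        # A deque with maxlen drops the oldest transaction automatically.
        self.window = deque(maxlen=size)

    def add(self, transaction):
        items = frozenset(transaction)
        if items:  # empty transactions carry no weight and are skipped
            self.window.append(items)

    def weighted_support(self, pattern, weights):
        """Weight of transactions containing `pattern` over total window weight."""
        pattern = frozenset(pattern)
        total = sum(transaction_weight(t, weights) for t in self.window)
        if total == 0:
            return 0.0
        covered = sum(
            transaction_weight(t, weights) for t in self.window if pattern <= t
        )
        return covered / total

    def frequent_weighted_patterns(self, weights, min_ws, max_len=2):
        """Brute-force enumeration of patterns up to `max_len` items (illustration only)."""
        items = sorted(set().union(*self.window)) if self.window else []
        result = {}
        for k in range(1, max_len + 1):
            for candidate in combinations(items, k):
                ws = self.weighted_support(candidate, weights)
                if ws >= min_ws:
                    result[candidate] = ws
        return result


# Hypothetical item weights and a short stream of transactions.
weights = {"a": 0.6, "b": 0.1, "c": 0.3}
stream = [{"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]

sw = SlidingWindow(size=3)
for t in stream:
    sw.add(t)  # once the window is full, the oldest transaction is evicted

patterns = sw.frequent_weighted_patterns(weights, min_ws=0.5)
```

Rescanning the whole window for every candidate, as this sketch does, is the cost that a tree-based summary of the window is meant to avoid.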
Mining frequent weighted itemsets (FWIs) from weighted-item transaction databases has recently received research interest. In real-world applications, sparse weighted-item transaction databases (SWITDs) are common. For example, supermarkets have many items, but each transaction has a small number of items. In this paper, we propose an interval word segment (IWS) structure to store and process tidsets for enhancing the effectiveness of mining FWIs from SWITDs. The IWS structure allows the intersection of tidsets between two itemsets to be performed very fast. A map array is proposed for storing a 1-bit index for words. From the map array, 1-bits are mapped to create