With the rapidly growing use of the World Wide Web for important and sensitive purposes, it has become necessary to discover interesting web access patterns from the access sequences that users frequently follow. Web access sequential patterns can provide business intelligence for e-commerce sites and can also be used to analyze system performance. This paper proposes a more efficient web mining algorithm that mines all sequential patterns from the web access sequences and completely eliminates linking between nodes. The algorithm builds an aggregate tree from the sequences and then mines that tree using the RST (Root-set of Suffix Trees) for items that share the same prefix. It finds the frequent sequential patterns by recursively traversing the tree from root nodes to child nodes for the length-1 frequent items. The proposed approach does not need to generate any projected tree; it needs only the root-set for each prefix obtained in the previous step. Experimental results show a large performance gain over the FOF and WAP-tree mining techniques, with considerably reduced mining time.
Keywords: Frequent sequential pattern, Web access sequence, Web log mining, WAP-tree, First-Occurrence Forest (FOF), Root-set of Suffix Trees (RST).
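The root-set idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Node`, `build_aggregate_tree`, `first_occurrences`, and `mine_sequences` are hypothetical, the minimum support is assumed to be an absolute sequence count, and the traversal is a simplified reading of "root-set per prefix" in which the root-set for a prefix is the list of tree nodes where that prefix is first completed.

```python
from collections import Counter


class Node:
    """One node of the aggregate tree: an item label, a count, and children."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def build_aggregate_tree(sequences, keep):
    """Insert each sequence, restricted to the items in `keep`, into a prefix tree."""
    root = Node()
    for seq in sequences:
        node = root
        for item in seq:
            if item not in keep:
                continue  # non-frequent items are dropped before insertion
            child = node.children.get(item)
            if child is None:
                child = node.children[item] = Node(item)
            child.count += 1
            node = child
    return root


def first_occurrences(roots, item):
    """Return the first nodes labelled `item` strictly below each node in `roots`."""
    found = []
    stack = [child for r in roots for child in r.children.values()]
    while stack:
        node = stack.pop()
        if node.item == item:
            found.append(node)  # stop here: deeper matches are not "first"
        else:
            stack.extend(node.children.values())
    return found


def mine_sequences(sequences, min_support):
    """Return {pattern tuple: support} for every frequent sequential pattern."""
    # Length-1 frequent items: count each item once per sequence.
    counts = Counter()
    for seq in sequences:
        counts.update(set(seq))
    items = sorted(i for i, c in counts.items() if c >= min_support)

    root = build_aggregate_tree(sequences, set(items))
    patterns = {}

    def grow(prefix, roots):
        # Extend `prefix` by each frequent item; only the root-set is carried
        # forward, so no projected tree is ever built.
        for item in items:
            new_roots = first_occurrences(roots, item)
            support = sum(n.count for n in new_roots)
            if support >= min_support:
                pattern = prefix + (item,)
                patterns[pattern] = support
                grow(pattern, new_roots)

    grow((), [root])
    return patterns
```

A call such as `mine_sequences(["abdac", "eaebcac", "babfaec", "afbacfc"], 3)` treats each string as one access sequence of single-character page identifiers (illustrative data, not taken from the paper's experiments). The support of an extended pattern is the sum of the counts of its first-occurrence nodes, which is why the sketch needs no node links and no projected trees.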
In this paper, we propose an Incremental Sequential Pattern Tree mining algorithm that retrieves the updated frequent sequential patterns from a dynamic sequence database. The Sequential Pattern Tree stores both frequent and non-frequent items from the old sequence database. As a result, the proposed algorithm updates the old Sequential Pattern Tree by scanning only the new sequences; it does not need to rescan the whole updated database (old + new), which reduces the execution time of reconstructing the tree. We compared the proposed incremental mining approach with three existing algorithms (GSP, PrefixSpan, and FUSP-Tree-based mining) and obtained satisfactory results.
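The reason that keeping non-frequent items avoids a rescan can be shown with a small Python sketch. It is only an illustration under stated assumptions, not the Incremental Sequential Pattern Tree algorithm itself: the class `IncrementalSequenceTree`, its methods, and the ratio-based threshold are hypothetical names and choices introduced here.

```python
from collections import Counter


class TreeNode:
    """A prefix-tree node holding an item label, a count, and child nodes."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


class IncrementalSequenceTree:
    """Prefix tree that keeps every item, frequent or not (assumed design)."""

    def __init__(self):
        self.root = TreeNode()
        self.item_support = Counter()  # sequences containing each item
        self.num_sequences = 0

    def add_sequences(self, sequences):
        """Insert only the given sequences; earlier ones are never rescanned."""
        for seq in sequences:
            node = self.root
            for item in seq:
                child = node.children.get(item)
                if child is None:
                    child = node.children[item] = TreeNode(item)
                child.count += 1
                node = child
            self.item_support.update(set(seq))
            self.num_sequences += 1

    def frequent_items(self, min_ratio):
        """Items whose support is at least `min_ratio` of all sequences so far."""
        threshold = min_ratio * self.num_sequences
        return sorted(i for i, c in self.item_support.items() if c >= threshold)


tree = IncrementalSequenceTree()
tree.add_sequences(["abc", "abd"])        # old database
before = tree.frequent_items(0.75)
tree.add_sequences(["dab", "da"])         # new sequences only
after = tree.frequent_items(0.75)
```

In this toy run the item `d` is below the threshold in the old database but crosses it once the new sequences arrive. Because its counts were already stored in the tree, the update touches only the two new sequences.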