Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining 2003
DOI: 10.1145/956750.956807

Finding recent frequent itemsets adaptively over online data streams

Abstract: A data stream is a massive unbounded sequence of data elements continuously generated at a rapid rate. Consequently, the knowledge embedded in a data stream is likely to change as time goes by. Identifying the recent change of a data stream, especially for an online data stream, can provide valuable information for the analysis of the data stream. In addition, monitoring the continuous variation of a data stream makes it possible to find the gradual change of embedded knowledge. However, most of mining algorithm…

Cited by 228 publications (90 citation statements)
References 15 publications
“…There has been much work to find frequent itemsets (and their variations) in the off-line setting, often starting from the Apriori [1] and FP-Tree algorithms [21]. These concepts have been adapted to work over streams of data, generating algorithms such as FUP [8] and FP-stream [20]. A limitation of finding frequent itemsets is that the number of possibly frequent itemsets can become very large, meaning that the algorithm either has to track information about many candidates, or else aggressively prune the retained data and risk missing out on some frequent itemsets.…”
Section: Related Work (mentioning)
confidence: 99%
“…We now compare all the algorithms on the truly sparse synthetic data, for a stream of length 10^8. This data has a much smaller number of conditional heavy hitters compared to the number of parent items.…”
Section: Performance On Sparse Data (mentioning)
confidence: 99%
“…Most of the achievements related to frequent itemset mining in stream data [21][22][23][24][25][26][27][28][29][30][31] focus on this issue. In 2002, Datar proposed Ref.…”
Section: General Frequent Itemset Mining (mentioning)
confidence: 99%
“…It is likely that the embedded knowledge in a data stream will change quickly as time goes by. In order to catch the recent trend of data, the estDec algorithm [2] decayed the old occurrences of each itemset to diminish the effect of old transactions on the mining result of frequent itemsets in the data stream. However, in particular applications, only the frequent patterns mined from the recently arriving data within a fixed time period are of interest.…”
Section: Introduction (mentioning)
confidence: 99%
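
The decay idea described in the statement above can be illustrated with a minimal sketch. It is not the estDec implementation: it assumes a plain exponential decay applied once per transaction, enumerates small itemsets by brute force, and does no pruning. The names `DecayedItemsetCounter`, `decay`, and `max_size` are hypothetical and chosen only for this example.

```python
from itertools import combinations


class DecayedItemsetCounter:
    """Illustrative exponentially decayed itemset counts (assumed design, not estDec itself)."""

    def __init__(self, decay=0.99, max_size=2):
        if not 0.0 < decay <= 1.0:
            raise ValueError("decay must be in (0, 1]")
        self.decay = decay          # assumed per-transaction decay factor
        self.max_size = max_size    # largest itemset size tracked (assumption)
        self.tid = 0                # id of the latest transaction
        self.total = 0.0            # decayed number of transactions seen
        self.counts = {}            # itemset tuple -> (decayed count, last update tid)

    def add(self, transaction):
        """Process one transaction, decaying older occurrences lazily."""
        self.tid += 1
        self.total = self.total * self.decay + 1.0
        items = sorted(set(transaction))
        for size in range(1, self.max_size + 1):
            for itemset in combinations(items, size):
                count, last = self.counts.get(itemset, (0.0, self.tid))
                # Apply the decay owed since this itemset was last updated.
                count = count * self.decay ** (self.tid - last) + 1.0
                self.counts[itemset] = (count, self.tid)

    def support(self, itemset):
        """Decayed support of an itemset relative to the decayed transaction total."""
        key = tuple(sorted(set(itemset)))
        if key not in self.counts:
            return 0.0
        count, last = self.counts[key]
        return count * self.decay ** (self.tid - last) / self.total
```

With `decay=1.0` the counter reduces to ordinary support counting; smaller values weight recent transactions more heavily, so an itemset that stops appearing gradually loses support.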
“…Although the problem of mining frequent itemsets over data streams has been investigated in the above literature [2][3][8][9][11], the temporal relations among data items were not considered in these studies. Accordingly, it is essential to provide a data structure for maintaining sequential information of items within a sliding window to discover recently repeating patterns in the window.…”
Section: Introduction (mentioning)
confidence: 99%
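
For contrast with decay-based counting, the sliding-window setting mentioned in the statements above can be sketched as follows. This is a simplified illustration under stated assumptions: it tracks single items only, uses a fixed number of transactions as the window, and does not capture the sequential relations among items that the citing paper calls for. The names `SlidingWindowCounter`, `window_size`, and `min_support` are hypothetical.

```python
from collections import Counter, deque


class SlidingWindowCounter:
    """Illustrative item counts over the most recent `window_size` transactions."""

    def __init__(self, window_size):
        if window_size < 1:
            raise ValueError("window_size must be at least 1")
        self.window_size = window_size
        self.window = deque()     # transactions currently inside the window
        self.counts = Counter()   # item -> number of window transactions containing it

    def add(self, transaction):
        """Insert a new transaction and expire the oldest one if the window is full."""
        items = frozenset(transaction)
        self.window.append(items)
        self.counts.update(items)
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            self.counts.subtract(expired)
            for item in expired:
                if self.counts[item] <= 0:
                    del self.counts[item]

    def frequent_items(self, min_support):
        """Items whose support within the current window is at least min_support."""
        n = len(self.window)
        return {item for item, c in self.counts.items() if c / n >= min_support}
```

Unlike the decayed counter, an expired transaction here stops contributing entirely, at the cost of keeping every transaction in the window in memory.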