2011
DOI: 10.1198/jasa.2011.ap10217

Distinct Counting With a Self-Learning Bitmap

Abstract: Counting the number of distinct elements (cardinality) in a dataset is a fundamental problem in database management. In recent years, driven by many modern applications, there has been significant interest in addressing the distinct counting problem in a data stream setting, where each incoming data item can be seen only once and cannot be stored for long periods of time. … and memory resources. However, the performances of these methods are not scale-invariant, in the sense that their relative root mean square est…

Citations: cited by 21 publications (15 citation statements)
References: 18 publications
“…In addition to LPC [48] and HLL [17], there are also many methods developed to estimate cardinality for each user. In detail, [15], [7] combine LPC and different sampling methods to enlarge the estimation range of LPC. Flajolet and Martin [18] develop a sketch method FM, which uses a register to estimate the data stream's cardinality bounded by 2^w, where w is the number of bits in the register.…”
Section: Related Work (mentioning)
confidence: 99%
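
The register-based FM idea mentioned in this statement can be illustrated with a minimal Python sketch. It is not taken from the cited papers: the hash function (BLAKE2b), the helper names `_hash64`, `_rho`, and `fm_estimate`, and the single-register setup are assumptions chosen for illustration. The constant 0.77351 is the bias-correction factor from Flajolet and Martin's analysis. Because the register has only w bits, the estimate cannot grow beyond the order of 2^w.

```python
import hashlib


def _hash64(item: str) -> int:
    """Hash an item to a 64-bit integer (BLAKE2b is an illustrative choice)."""
    digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def _rho(x: int, w: int) -> int:
    """Position of the lowest set bit of x, capped at w - 1."""
    if x == 0:
        return w - 1
    return min((x & -x).bit_length() - 1, w - 1)


def fm_estimate(items, w: int = 32) -> float:
    """Estimate the number of distinct items with one w-bit FM register."""
    register = 0
    for item in items:
        # Each item sets one bit; duplicates always set the same bit.
        register |= 1 << _rho(_hash64(item), w)
    # r is the position of the lowest zero bit in the register (at most w).
    r = 0
    while r < w and (register >> r) & 1:
        r += 1
    return (2 ** r) / 0.77351
```

A single register gives a high-variance estimate; practical variants average over many registers, which this sketch omits for brevity.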
“…Therefore, it needs to set a large m to handle data streams with large cardinalities. [17], [5] combine LPC and different sampling methods to enlarge the estimation range. Flajolet and Martin [20] develop a sketch method FM, which uses a register to estimate the data stream's cardinality and provides a cardinality estimation bounded by 2^w, where w is the number of bits in the register.…”
Section: Related Work (mentioning)
confidence: 99%
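
The need for a large m in LPC (linear probabilistic counting) can be seen in a minimal Python sketch. This is an illustration under stated assumptions, not code from the cited papers: the function name `lpc_estimate`, the BLAKE2b hash, and the choice to return infinity on saturation are all hypothetical. The estimator itself is the standard one, -m * ln(V), where V is the fraction of bitmap positions still zero.

```python
import hashlib
import math


def lpc_estimate(items, m: int = 1024) -> float:
    """Estimate the number of distinct items with an m-position bitmap."""
    bitmap = bytearray(m)  # one byte per position, for readability
    for item in items:
        digest = hashlib.blake2b(item.encode("utf-8"), digest_size=8).digest()
        bitmap[int.from_bytes(digest, "big") % m] = 1
    zeros = m - sum(bitmap)
    if zeros == 0:
        # Bitmap is saturated: the estimator is undefined, so m was too small.
        return float("inf")
    return -m * math.log(zeros / m)
```

Once the cardinality is much larger than m, every position is set and the estimator breaks down, which is why combining LPC with sampling is used to extend its range.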
“…The latter algorithm is particularly well suited for large‐scale cardinality estimation problems. Chen & Cao (2009) develop an algorithm that combines hashing to bit patterns with sampling at an adaptive rate. They show empirically that their algorithm outperforms Hyper‐LogLog for small‐ to medium‐scale problems, but lack theoretical justification of this claim.…”
Section: Definitions and History (mentioning)
confidence: 99%