2011
DOI: 10.1111/j.1467-9469.2010.00727.x

A Statistical Analysis of Probabilistic Counting Algorithms

Abstract: This article considers the problem of cardinality estimation in data stream applications. We present a statistical analysis of probabilistic counting algorithms, focusing on two techniques that use pseudo‐random variates to form low‐dimensional data sketches. We apply conventional statistical methods to compare probabilistic algorithms based on storing either selected order statistics, or random projections. We derive estimators of the cardinality in both cases, and show that the maximal‐term estima…

Cited by 24 publications (31 citation statements)
References 40 publications
“…Then, $\mu := \mathrm{E}[N(t)] = \frac{\alpha}{1+\alpha} N_{\max}$ (14), $\operatorname{var}(N(t)) = \frac{\alpha}{(1+\alpha)^{2}} N_{\max}$ (15), $\operatorname{cov}(N(t), \ldots$ (16). Thus,…”
Section: A Derivation Of the Regularization Term (mentioning)
confidence: 99%
“…For two unbiased estimators using the same amount of memory, the one with a smaller relative standard error is better. For a detailed classification and comparison of existing single-flow cardinality estimation algorithms, see [10,11]. We take a closer look at the LogLog and HyperLogLog algorithms in Section 2.1.1.…”
Section: Single-flow Cardinality Estimation (mentioning)
confidence: 99%
“…To bound the variance, both schemes repeat the above procedures for m different hash functions and use their combined statistics for the estimation. A comprehensive overview of different cardinality estimation techniques is given in [7,21]. State-of-the-art cardinality estimators have a standard error of about 1/√m, where m is the number of storage units [6].…”
Section: Related Work and Previous Schemes (mentioning)
confidence: 99%
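
The 1/√m standard error quoted above can be checked empirically with a small order-statistics sketch. The Python below is a minimal illustration, not the estimator derived in the article: it assumes a "k minimum values" scheme that keeps the m smallest hash values and estimates the cardinality as (m − 1) divided by the m-th smallest. The names hash01, kmv_estimate and relative_standard_error are hypothetical helpers introduced for this example, and BLAKE2b stands in for an idealized uniform hash.

```python
import hashlib
import heapq
import statistics


def hash01(item, seed):
    """Map (seed, item) to a pseudo-random float in (0, 1]."""
    data = f"{seed}:{item}".encode()
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return (int.from_bytes(digest, "big") + 1) / (2**64 + 1)


def kmv_estimate(stream, m, seed=0):
    """Estimate the number of distinct items from the m smallest hashes."""
    heap = []     # max-heap (negated values) holding the m smallest hashes
    kept = set()  # the same hash values, used to skip repeated items
    for item in stream:
        h = hash01(item, seed)
        if h in kept:
            continue
        if len(heap) < m:
            heapq.heappush(heap, -h)
            kept.add(h)
        elif h < -heap[0]:
            # Replace the current largest retained hash with the new one.
            evicted = -heapq.heappushpop(heap, -h)
            kept.discard(evicted)
            kept.add(h)
    if len(heap) < m:
        return float(len(heap))  # fewer than m distinct items: exact count
    return (m - 1) / (-heap[0])  # (m - 1) / m-th smallest hash value


def relative_standard_error(n, m, trials=30):
    """Empirical relative standard error over independent hash seeds."""
    stream = list(range(n)) * 2  # every item appears twice
    estimates = [kmv_estimate(stream, m, seed=s) for s in range(trials)]
    return statistics.pstdev(estimates) / n, statistics.mean(estimates)


if __name__ == "__main__":
    n, m = 5000, 256
    rse, mean_est = relative_standard_error(n, m)
    print(f"true n = {n}, mean estimate = {mean_est:.0f}")
    print(f"empirical RSE = {rse:.3f}, 1/sqrt(m) = {m ** -0.5:.3f}")
```

Under these assumptions the empirical relative standard error should land close to 1/√m (about 0.06 for m = 256), and quadrupling m should roughly halve it.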