Given large data streams of items, each attributable to a certain key and possessing a certain volume, the aggregate volume associated with a key is difficult to estimate both efficiently and accurately. On the one hand, exact counting with dedicated counters incurs unacceptable overhead during stream processing. On the other hand, sketch algorithms, i.e., approximate-counting techniques that share counters among keys, have suffered from a trade-off between accuracy and query efficiency: classic sketch algorithms compute rough estimates efficiently, whereas more recent proposals yield highly accurate estimates at the cost of greatly increased computation time.
In this work, we propose three sketch algorithms that overcome this trade-off, computing highly accurate estimates with lightweight procedures. To reconcile these desiderata, we employ novel estimation methods that rely on Bayesian probability theory, counter-cardinality information, and basic machine-learning techniques. The combination of these techniques enables highly accurate estimates, which we demonstrate through both a theoretical worst-case analysis and an experimental evaluation. Concretely, our sketches efficiently produce volume estimates with an average relative error below 4%, which previous methods could only achieve with computations that are several orders of magnitude more expensive.
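
As background for the counter-sharing idea mentioned above, the following is a minimal sketch of one classic sketch algorithm, the Count-Min sketch. It is not one of the three algorithms proposed in this work; it only illustrates how keys share counters and why the resulting estimates are cheap but rough. The class name, the parameters `width` and `depth`, and the use of salted BLAKE2b hashing are illustrative assumptions.

```python
import hashlib


class CountMinSketch:
    """Classic Count-Min sketch: `depth` rows of `width` counters shared by all keys.

    Illustrative background only; not the estimation method of this work.
    """

    def __init__(self, width: int, depth: int) -> None:
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, key: str, row: int) -> int:
        # One hash function per row, derived by salting BLAKE2b with the row number.
        digest = hashlib.blake2b(
            key.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def update(self, key: str, volume: int) -> None:
        # Add the item's volume to one counter in every row.
        for r in range(self.depth):
            self.rows[r][self._index(key, r)] += volume

    def estimate(self, key: str) -> int:
        # Colliding keys can only inflate a counter (for non-negative volumes),
        # so the minimum over the rows is the least-inflated estimate.
        return min(self.rows[r][self._index(key, r)] for r in range(self.depth))
```

For non-negative volumes, `estimate` never returns less than the true aggregate volume of a key, but it may overestimate whenever other keys collide with it in every row. A query touches only `depth` counters, which is what makes this kind of estimate efficient yet rough.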