Network management requires accurate estimates of metrics for many applications including traffic engineering (e.g., heavy hitters), anomaly detection (e.g., entropy of source addresses), and security (e.g., DDoS detection). Obtaining accurate estimates given router CPU and memory constraints is a challenging problem. Existing approaches fall in one of two undesirable extremes: (1) low fidelity generalpurpose approaches such as sampling, or (2) high fidelity but complex algorithms customized to specific applicationlevel metrics. Ideally, a solution should be both general (i.e., supports many applications) and provide accuracy comparable to custom algorithms. This paper presents Univ-Mon, a framework for flow monitoring which leverages recent theoretical advances and demonstrates that it is possible to achieve both generality and high accuracy. UnivMon uses an application-agnostic data plane monitoring primitive; different (and possibly unforeseen) estimation algorithms run in the control plane, and use the statistics from the data plane to compute application-level metrics. We present a proofof-concept implementation of UnivMon using P4 and develop simple coordination techniques to provide a "one-bigswitch" abstraction for network-wide monitoring. We evaluate the effectiveness of UnivMon using a range of tracedriven evaluations and show that it offers comparable (and sometimes better) accuracy relative to custom sketching solutions across a range of monitoring tasks.
Today's large-scale services (
e.g.
, video streaming platforms, data centers, sensor grids) need diverse real-time summary statistics across multiple subpopulations of multidimensional datasets. However, state-of-the-art frameworks do not offer general and accurate analytics in real time at reasonable costs. The root cause is the combinatorial explosion of data subpopulations and the diversity of summary statistics we need to monitor simultaneously. We present Hydra, an efficient framework for multidimensional analytics that presents a novel combination of using a "sketch of sketches" to avoid the overhead of monitoring exponentially-many subpopulations and universal sketching to ensure accurate estimates for multiple statistics. We build Hydra as an Apache Spark plugin and address practical system challenges to minimize overheads at scale. Across multiple real-world and synthetic multidimensional datasets, we show that Hydra can achieve robust error bounds and is an order of magnitude more efficient in terms of operational cost and memory footprint than existing frameworks (e.g., Spark, Druid) while ensuring interactive estimation times.
Sketching algorithms or sketches have emerged as a promising alternative to the traditional packet sampling-based network telemetry solutions. At a high level, they are attractive because of their high resource efficiency and accuracy guarantees. While there have been significant recent advances in various aspects of sketching for networking tasks, many fundamental challenges remain unsolved that are likely stumbling blocks for adoption. Our contribution in this paper is in identifying and formulating these research challenges across the ecosystem encompassing network operators, platform vendors/developers, and algorithm designers. We hope that these serve as a necessary fillip for the community to enable the broader adoption of sketch-based telemetry.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.