On a high-speed link, there may be tens of millions of IP packets per second and millions of active flows. Maintaining the state of each flow is a fundamental task underlying many network functions, such as load balancing and network anomaly detection. There are two important kinds of per-flow states: perflow size (e.g., the number of packets received by an arbitrary destination IP) and per-flow cardinality (e.g., the number of distinct source IP addresses that contacted each destination IP). In this paper, we focus on the latter kind of states, and define a new problem: online query of per-flow cardinality, in which we query any given flow's cardinality entirely on the data plane with low time complexity. For this problem, we propose three solutions named On-vHLL, Ton-vHLL and Aton-vHLL, whose time cost are O(1) even for the query operation. Our proposed techniques are three folds. First, we redesign the traditional vHLL with new supplementary data structures called incremental update units (IUUs). When a certain flow's cardinality is queried, these IUUs can avoid scanning the whole data structure and reduce the time complexity to O(1). Second, we apply a HLL register compression technique called TailCut to the On-vHLL sketch, which can save memory cost by 50%. Third, we add a prefilter based on min-heap, alongside the Ton-vHLL sketch. The prefilter is to give each currently sampled top-k superspreader a dedicated HyperLogLog estimator for better accuracy. It can also absorb the superspreaders' packets bypassing the sketch. We evaluate our new sketches by simulation with CAIDA traces. The results show that our On-vHLL, Ton-vHLL and Aton-vHLL sketches need about 5 memory accesses per packet. The time cost of query operation decreases by hundreds of times than the traditional vHLL that can only be queried offline. Meanwhile, the estimation error of flow spread by our Aton-vHLL is comparable to vHLL.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.