Abstract-A classifier consists of a set of rules for classifying packets based on header fields. Because core routers can have fairly large (e.g., 2000 rule) database and must use limited SRAM to meet OC-768 speeds, the best existing classification algorithms (RFC, HiCuts, ABV) are precluded because of the large amount of memory they need. Thus the general belief is that hardware solutions like CAMs are needed, despite the amount of board area and power they consume. In this paper, we provide an alternative to CAMs via an Extended Grid-of-Tries with Path Compression (EGT-PC) algorithm whose worst-case speed scales well with database size while using a minimal amount of memory. Our evaluation is based on real databases used by Tier 1 ISPs, and synthetic databases. EGT-PC is based on a observation that we found holds for all the Tier 1 databases we studied: regardless of database size, any packet matches only a small number of distinct source-destination prefix pairs. The code we wrote for EGT-PC, RFC, HiCuts, and ABV is publicly available [16], providing the first publicly available code to encourage experimentation with classification algorithms.
Monitoring network traffic and detecting unwanted applications has become a challenging problem, since many applications obfuscate their traffic using unregistered port numbers or payload encryption. Apart from some notable exceptions, most traffic monitoring tools use two types of approaches: (a) keeping traffic statistics such as packet sizes and interarrivals, flow counts, byte volumes, etc., or (b) analyzing packet content. In this paper, we propose the use of Traffic Dispersion Graphs (TDGs) as a way to monitor, analyze, and visualize network traffic. TDGs model the social behavior of hosts ("who talks to whom"), where the edges can be defined to represent different interactions (e.g. the exchange of a certain number or type of packets). With the introduction of TDGs, we are able to harness a wealth of tools and graph modeling techniques from a diverse set of disciplines.
In traffic monitoring, accounting, and network anomaly detection, it is often important to be able to detect high-volume traffic clusters in near real-time. Such heavy-hitter traffic clusters are often hierarchical (i.e., they may occur at different aggregation levels like ranges of IP addresses) and possibly multidimensional (i.e., they may involve the combination of different IP header fields like IP addresses, port numbers, and protocol). Without prior knowledge about the precise structures of such traffic clusters, a naive approach would require the monitoring system to examine all possible combinations of aggregates in order to detect the heavy hitters, which can be prohibitive in terms of computation resources.In this paper, we focus on online identification of 1-dimensional and 2-dimensional hierarchical heavy hitters (HHHs), arguably the two most important scenarios in traffic analysis. We show that the problem of HHH detection can be transformed to one of dynamic packet classification by taking a top-down approach and adaptively creating new rules to match HHHs. We then adapt several existing static packet classification algorithms to support dynamic packet classification. The resulting HHH detection algorithms have much lower worst-case update costs than existing algorithms and can provide tunable deterministic accuracy guarantees. As an application of these algorithms, we also propose robust techniques to detect changes among heavy-hitter traffic clusters. Our techniques can accommodate variability due to sampling that is increasingly used in network measurement. Evaluation based on real Internet traces collected at a Tier-1 ISP suggests that these techniques are remarkably accurate and efficient.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.