Associative memories offer high levels of parallelism in matching a query against stored entries. We design and analyze an architecture which uses a single lookup into a Ternary Content Addressable Memory (TCAM) to solve the subset query problem for small sets, i.e., to check whether a given set (the query) contains (or alternately, is contained in) any one of a large collection of sets in a database. We use each TCAM entry as a small Ternary Bloom Filter (each 'bit' of which is one of {0,1," * " }) to store one of the sets in the collection. Like Bloom filters, our architecture is susceptible to false positives. Since each TCAM entry is quite small, asymptotic analyses of Bloom filters do not directly apply. Surprisingly, we are able to show that the asymptotic false positive probability formula can be safely used if we penalize the small Bloom filter by taking away just one bit of storage and adding just half an extra set element before applying the formula. We believe that this analysis is independently interesting.The subset query problem has applications in databases, network intrusion detection, packet classification in Internet routers, and Information Retrieval. We demonstrate our architecture on one illustrative streaming application -intrusion detection in network traffic. By shingling the strings in the database, we can perform a single subset query, and hence a single TCAM search, to skip many bytes in the stream. We evaluate our scheme on the open source CLAM anti-virus database, for worst-case as well as random streams. Our architecture appears to be at least one order of magnitude faster than previous approaches. Since the individual Bloom filters must fit in a single TCAM entry (currently 72 to 576 bits), our solution applies only when each set is of a small cardinality. However, this is sufficient for many typical applications. Also, recent algorithms for the subset-query problem use a small-set version as a subroutine.
Distributed hash table (DHT) systems are an important class of peer-to-peer routing infrastructures. They enable scalable wide-area storage and retrieval of information, and will support the rapid development of a wide variety of Internet-scale applications ranging from naming systems and file systems to application-layer multicast. DHT systems essentially build an overlay network, but a path on the overlay between any two nodes can be significantly different from the unicast path between those two nodes on the underlying network. As such, the lookup latency in these systems can be quite high and can adversely impact the performance of applications built on top of such systems.In this paper, we discuss a random sampling technique that incrementally improves lookup latency in DHT systems. Our sampling can be implemented using information gleaned from lookups traversing the overlay network. For this reason, we call our approach lookup-parasitic random sampling (LPRS). LPRS is fast, incurs little network overhead, and requires relatively few modifications to existing DHT systems.For idealized versions of DHT systems like Chord, Tapestry and Pastry, we analytically prove that LPRS can result in lookup latencies proportional to the average unicast latency of the network, provided the underlying physical topology has a power-law latency expansion. We then validate this analysis by implementing LPRS in the Chord simulator. Our simulations reveal that LPRS-Chord exhibits a qualitatively better latency scaling behavior relative to unmodified Chord.Finally, we provide evidence which suggests that the Internet router-level topology resembles power-law latency ex- pansion. This finding implies that LPRS has significant practical applicability as a general latency reduction technique for many DHT systems. This finding is also of independent interest since it might inform the design of latencysensitive topology models for the Internet.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.