Abstract. Building distributed content-based publish/subscribe systems has remained a challenge. Existing solutions typically use a relatively small set of trusted computers as brokers, which may lead to scalability concerns for large Internet-scale workloads. Moreover, since each broker maintains state for a large number of users, it may be difficult to tolerate faults at each broker. In this paper we propose an approach to building content-based publish/subscribe systems on top of distributed hash table (DHT) systems. DHT systems have been effectively used for scalable and fault-tolerant resource lookup in large peer-to-peer networks. Our approach provides predicate-based query semantics and supports constrained range queries. Experimental evaluation shows that our approach is scalable to thousands of brokers, although proper tuning is required.
Multicore processors contain new hardware characteristics that are different from previous generation single-core systems or traditional SMP (symmetric multiprocessing) multiprocessor systems. These new characteristics provide new performance opportunities and challenges. In this paper, we show how hardware performance monitors can be used to provide a fine-grained, closely-coupled feedback loop to dynamic optimizations done by a multicore-aware operating system. These multicore optimizations are possible due to the advanced capabilities of hardware performance monitoring units currently found in commodity processors, such as execution pipeline stall breakdown and data address sampling. We demonstrate three case studies on how a multicore-aware operating system can use these online capabilities for (1) determining cache partition sizes, which helps reduce contention in the shared cache among applications, (2) detecting memory regions with bad cache usage, which helps in isolating these regions to reduce cache pollution, and (3) detecting sharing among threads, which helps in clustering threads to improve locality. Using realistic applications from standard benchmark suites, the following performance improvements were achieved: (1) up to 27% improvement in IPC (instructions-per-cycle) due to cache partition sizing; (2) up to 10% reduction in cache miss rates due to reduced cache pollution, resulting in up to 7% improvement in IPC; and (3) up to 70% reduction in remote cache accesses due to thread clustering, resulting in up to 7% application-level improvement.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.