Software Transactional Memory (STM) systems have emerged as a powerful paradigm for developing concurrent applications. To date, however, the problem of how to build distributed and replicated STMs that enhance both dependability and performance remains largely unexplored. We address this problem with a non-blocking distributed certification scheme, which we name BFC (Bloom Filter Certification). BFC exploits a novel Bloom Filter-based encoding mechanism that significantly reduces the overhead of replica coordination at the cost of a user-tunable increase in the probability of transaction abort. Through an extensive experimental study based on standard STM benchmarks, we show that the BFC scheme achieves remarkable performance gains even for negligible (e.g., 1%) increases in the transaction abort rate.
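The trade-off described above, smaller coordination messages in exchange for a tunable rate of spurious aborts, can be illustrated with a generic Bloom filter. The following is a minimal sketch under stated assumptions, not the encoding used by BFC itself: the class `BloomFilter`, the function `certify`, the SHA-256 double-hashing scheme, and the string keys are all hypothetical choices made for illustration. It assumes a transaction's read set is encoded in a filter sized for a target false-positive rate, and that certification aborts the transaction if any item written by a concurrently committed transaction appears to be in that filter.

```python
import hashlib
import math


class BloomFilter:
    """Generic Bloom filter over string keys (illustrative only)."""

    def __init__(self, expected_items: int, false_positive_rate: float):
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        n = max(1, expected_items)
        self.num_bits = max(
            8, math.ceil(-n * math.log(false_positive_rate) / (math.log(2) ** 2))
        )
        self.num_hashes = max(1, round(self.num_bits / n * math.log(2)))
        self.bits = 0  # a Python int used as a bit vector

    def _positions(self, key: str):
        # Double hashing: derive k bit positions from one SHA-256 digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        # No false negatives; false positives occur at roughly the target rate.
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def certify(read_set_filter: BloomFilter, concurrent_write_sets) -> bool:
    """Return True (commit) if no concurrent write appears in the read-set filter."""
    for write_set in concurrent_write_sets:
        for key in write_set:
            if read_set_filter.might_contain(key):
                return False  # real conflict or false positive: abort
    return True


# Hypothetical usage: a transaction that read three items.
read_filter = BloomFilter(expected_items=3, false_positive_rate=0.01)
for item in ("account:1", "account:2", "account:3"):
    read_filter.add(item)

# A concurrent transaction wrote an item that was read: always an abort.
print(certify(read_filter, [{"account:2"}]))
```

Because a Bloom filter never yields false negatives, a genuine conflict is always detected; a false positive can only cause an unnecessary abort. Lowering `false_positive_rate` makes the filter larger, and so the message heavier, while reducing spurious aborts, which is the kind of tunable knob the abstract refers to.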
Abstract. Total Order Broadcast (TOB) is a fundamental building block at the core of a number of strongly consistent, fault-tolerant replication schemes. While it is widely known that the performance of existing TOB algorithms varies greatly depending on the workload and deployment scenario, the problem of how to forecast their performance in realistic settings remains, to date, largely unexplored. In this paper we address this problem by exploring the possibility of leveraging machine learning techniques to build, in a fully decentralized fashion, performance models of TOB protocols. Based on an extensive experimental study considering heterogeneous workloads and multiple TOB protocols, we assess the accuracy and efficiency of alternative machine learning methods, including neural networks, support vector machines, and decision tree-based regression models. We propose two heuristics for the feature selection phase that reduce its execution time by up to two orders of magnitude while incurring only a very limited loss of prediction accuracy.
Abstract. In-memory NoSQL transactional data grids are emerging as an attractive alternative to conventional relational distributed databases. In these platforms, replication plays a role of paramount importance, as it is the key mechanism for ensuring data durability. In this work we focus on Atomic Broadcast (AB) based certification replication schemes, which have recently emerged as a much more scalable alternative to classical replication protocols based on active replication or atomic commit protocols. We first show that, among the existing AB-based certification protocols, no "one-size-fits-all" solution exists that achieves optimal performance in the presence of heterogeneous workloads. Next, we present PolyCert, a polymorphic certification protocol that allows different certification protocols to coexist concurrently, relying on machine learning techniques to determine the optimal certification scheme on a per-transaction basis. We design and evaluate two alternative oracles, based on parameter-free machine learning techniques that rely on off-line and on-line training approaches, respectively. Our experimental results demonstrate the effectiveness of the proposed approach, highlighting that PolyCert achieves performance extremely close to that of an optimal non-adaptive certification protocol in the presence of non-heterogeneous workloads, and significantly outperforms any non-adaptive protocol when used with realistic, complex applications that generate heterogeneous workloads.
Abstract. Replication plays an essential role in in-memory distributed transactional platforms, given that it is the primary means of ensuring data durability. Unfortunately, no single replication technique can ensure optimal performance across a wide range of workloads and system configurations. This paper tackles this problem by presenting MORPHR, a framework that automatically adapts the replication protocol of in-memory transactional platforms to the current operational conditions. MORPHR presents two key innovative aspects. On the one hand, it allows specialized algorithms that regulate the switching between arbitrary replication protocols to be plugged in, in a modular fashion. On the other hand, MORPHR relies on state-of-the-art machine learning techniques to autonomously determine the best replication protocol in the face of varying workloads. We integrated MORPHR in an open-source in-memory NoSQL data grid and evaluated it by means of an extensive experimental study. The results highlight that MORPHR is accurate in identifying the best replication strategy in the presence of complex, realistic workloads, and does so with minimal overhead.
Abstract. Distributed in-memory caches are increasingly used to improve the performance of applications that require frequent access to large amounts of data. To maximize performance and scalability, these platforms typically rely on weakly consistent partial replication mechanisms. These schemes partition the data across the nodes and ensure a predefined (and typically very small) replication degree, thus maximizing the global memory capacity of the platform and keeping the cost of ensuring replica consistency constant as the scale of the platform grows. Moreover, even though several of these platforms provide transactional support, they typically sacrifice consistency, providing guarantees that are weaker than classic 1-copy serializability but that allow for more efficient implementations. This paper proposes and evaluates two partial replication techniques that provide different (weak) consistency guarantees but share a reliance on total order multicast primitives to serialize transactions without incurring distributed deadlocks, a main source of inefficiency in classical two-phase commit (2PC) based replication mechanisms. We integrate the proposed replication schemes into Infinispan, a prominent open-source distributed in-memory cache, which is the reference clustering solution for the well-known JBoss AS platform. Our performance evaluation highlights speed-ups of up to 40x when using the proposed algorithms with respect to the native Infinispan replication mechanism, which relies on classic 2PC-based replication.