Transactional Causal Consistency (TCC) extends causal consistency, the strongest consistency model compatible with availability, with interactive read-write transactions, and is therefore particularly appealing for geo-replicated platforms. This paper presents Wren, the first TCC system that at the same time i) implements nonblocking read operations, thereby achieving low latency, and ii) allows an application to efficiently scale out within a replication site by sharding. Wren introduces new protocols for transaction execution, dependency tracking and stabilization. The transaction protocol supports nonblocking reads by providing a transaction with a snapshot that is the union of a fresh causal snapshot S installed by every partition in the local data center and a client-side cache for writes that are not yet included in S. The dependency tracking and stabilization protocols require only two scalar timestamps, resulting in efficient resource utilization and providing scalability in terms of replication sites. In return for these benefits, Wren slightly increases the visibility latency of updates. We evaluate Wren on an AWS deployment using up to 5 replication sites and 16 partitions per site. We show that Wren delivers up to 1.4x higher throughput and up to 3.6x lower latency when compared to the state-of-the-art design. The choice of an older snapshot increases local update visibility latency by a few milliseconds. The use of only two timestamps to track causality increases remote update visibility latency by less than 15%.
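The nonblocking read path described above can be illustrated with a minimal sketch. The abstract states only that a transaction reads from the union of a causal snapshot S and a client-side cache of writes not yet in S. The class names, the single scalar `snapshot_ts` (Wren itself tracks two timestamps), and the cache-pruning rule are assumptions made for illustration, not Wren's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class Version:
    value: str
    ts: int  # hypothetical scalar commit timestamp


class Partition:
    """Hypothetical multi-version store for one partition."""

    def __init__(self):
        self.versions = {}  # key -> list of Version, kept sorted by ts

    def install(self, key, value, ts):
        chain = self.versions.setdefault(key, [])
        chain.append(Version(value, ts))
        chain.sort(key=lambda v: v.ts)

    def read_at(self, key, snapshot_ts):
        # Newest version inside the snapshot; never waits for newer ones.
        visible = [v for v in self.versions.get(key, []) if v.ts <= snapshot_ts]
        return visible[-1] if visible else None


class Client:
    """Keeps the client's own writes that the snapshot does not cover yet."""

    def __init__(self):
        self.cache = {}  # key -> Version

    def write(self, key, value, ts):
        self.cache[key] = Version(value, ts)

    def read(self, partition, key, snapshot_ts):
        cached = self.cache.get(key)
        if cached is not None and cached.ts <= snapshot_ts:
            # Assumed pruning rule: the snapshot now covers this write.
            del self.cache[key]
            cached = None
        if cached is not None:
            return cached.value  # own write, newer than the snapshot
        version = partition.read_at(key, snapshot_ts)
        return version.value if version else None
```

In this sketch the server side never blocks because it only returns versions that are already part of the snapshot; the cache is what lets a client still observe its own recent writes.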
Causal consistency is an attractive consistency model for geo-replicated data stores because it hits a sweet spot in the trade-off between ease of programming and performance. In this paper we propose a new approach to causal consistency, which we call Optimistic Causal Consistency (OCC). The optimism of our approach lies in the fact that updates from a remote data center are immediately made visible to clients in the local data center. A client, hence, always reads the freshest version of an item, whose dependencies, however, might not have been installed in the local data center yet. When serving a read request, a server can detect whether it has not yet received such dependencies. This is achieved without inter-server synchronization thanks to cheap dependency metadata supplied by the client. Upon detecting a missing dependency, the server waits to receive it. This approach contrasts with the design of existing systems, which are prone to exposing stale versions of data items in order to ensure that clients only see versions whose dependencies have already been replicated in the local data center. OCC explores a novel trade-off in the landscape of consistency models. Because network partitions are rare in practice, OCC partially trades availability to improve other performance metrics. On the one hand, OCC maximizes the freshness of data returned to clients and reduces the communication overhead. On the other hand, a server might need to wait before serving a client's request, which makes the system unavailable in case of a network partition. To overcome this limitation, we propose a recovery mechanism that allows an OCC system to fall back to a pessimistic protocol to recover availability. We implement OCC in a new system, which we call POCC. We compare POCC against a recent (pessimistic) approach to causal consistency using heterogeneous workloads on an Amazon AWS deployment encompassing up to 96 nodes scattered over 3 data centers. We show that POCC is able to maximize the freshness of data returned to clients while providing comparable or better performance than its pessimistic counterpart in a wide range of production-like workloads.
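The server-side dependency check described above can be sketched as follows. The abstract does not say what the client's dependency metadata looks like; this sketch assumes a vector with one timestamp per data center and in-order delivery of updates from each data center. `Server`, `apply_update`, and the `("wait", ...)` return value are hypothetical names, and a real server would block until the dependency arrives rather than return a status.

```python
class Server:
    """Hypothetical OCC-style server for one data center."""

    def __init__(self, num_dcs):
        # Highest update timestamp received so far from each data center.
        self.received = [0] * num_dcs
        self.store = {}  # key -> freshest value received

    def apply_update(self, dc, ts, key, value):
        # Optimistic: the update becomes visible as soon as it arrives.
        self.store[key] = value
        self.received[dc] = max(self.received[dc], ts)

    def missing(self, client_deps):
        # Data centers whose updates the client has seen but this server has not.
        return [dc for dc, ts in enumerate(client_deps) if self.received[dc] < ts]

    def read(self, key, client_deps):
        lagging = self.missing(client_deps)
        if lagging:
            # Stand-in for waiting until the missing dependencies arrive.
            return ("wait", lagging)
        return ("ok", self.store.get(key))
```

The check uses only the metadata carried in the request and the server's own local state, which is why no inter-server synchronization is needed on the read path in this sketch.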
Geo-replicated data platforms form the backbone of several large-scale online services. Transactional Causal Consistency (TCC) is an attractive consistency level for building such platforms. TCC avoids many anomalies of eventual consistency, eschews the synchronization costs of strong consistency, and supports interactive read-write transactions. Partial replication is another attractive design choice for building geo-replicated platforms, as it increases the storage capacity and reduces update propagation costs. This paper presents PaRiS, the first TCC system that supports partial replication and implements non-blocking parallel read operations, whose latency is paramount for the performance of read-intensive applications. PaRiS relies on a novel protocol to track dependencies, called Universal Stable Time (UST). By means of a lightweight background gossip process, UST identifies a snapshot of the data that has been installed by every DC in the system. Hence, transactions can consistently read from such a snapshot on any server in any replication site without having to block. Moreover, PaRiS requires only one timestamp to track dependencies and define transactional snapshots, thereby achieving resource efficiency and scalability. We evaluate PaRiS on a large-scale AWS deployment composed of up to 10 replication sites. We show that PaRiS scales well with the number of DCs and partitions, while being able to handle larger data sets than existing solutions that assume full replication. We also demonstrate a performance gain of non-blocking reads vs. a blocking alternative (up to 1.47x higher throughput with 5.91x lower latency for read-dominated workloads and up to 1.46x higher throughput with 20.56x lower latency for write-heavy workloads).
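One way to picture the UST idea is as a minimum over what every DC has installed. The abstract states only that a background gossip process identifies a snapshot installed by every DC; the aggregation below (minimum over partitions, then minimum over DCs) and all function names are assumptions for illustration, and the sketch ignores how gossip messages are exchanged and how partial replication affects which partitions each DC hosts.

```python
def local_stable_time(installed_per_partition):
    # Assumed: a DC has installed everything up to the minimum timestamp
    # installed across its partitions.
    return min(installed_per_partition)


def universal_stable_time(installed_by_dc):
    # Assumed: the snapshot installed by every DC is bounded by the
    # slowest DC's local stable time.
    return min(local_stable_time(ts) for ts in installed_by_dc.values())


def readable_without_blocking(snapshot_ts, ust):
    # A snapshot at or below the UST is available on any server.
    return snapshot_ts <= ust


# Hypothetical installed timestamps, one list entry per partition.
installed = {"dc1": [10, 12], "dc2": [8, 15], "dc3": [11, 9]}
ust = universal_stable_time(installed)
```

Because a single scalar summarizes the snapshot, a transaction in this sketch needs to carry only one timestamp, which matches the resource-efficiency claim in the abstract.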