Recent years have seen a vast diversification of the database market. In contrast to the "one-size-fits-all" paradigm that guided system design in the past, today's database management systems (DBMSs) are tuned for particular workloads. This has led to DBMSs optimized for high-performance, high-throughput read/write workloads in online transaction processing (OLTP) and to systems optimized for complex analytical queries (OLAP). However, this approach reaches its limit when systems have to deal with mixed workloads that are neither pure OLAP nor pure OLTP. In such cases, multistores are gaining popularity. Rather than supporting a single database paradigm and addressing one particular workload, multistores encompass several DBMSs that store data in different schemas and route each request, on a per-query basis, to the most appropriate system. In this paper, we introduce the multistore ICARUS. In our evaluation, based on a workload that combines OLTP and OLAP elements, we show that ICARUS is able to speed up queries by up to a factor of three by routing them to the best underlying DBMS.
More and more companies move their data to the Cloud, which is able to cope with high scalability and availability demands thanks to its pay-as-you-go cost model. For this, databases in the Cloud are distributed and replicated across different data centers. According to the CAP theorem, distributed data management is governed by a trade-off between consistency and availability. In addition, the stronger the provided consistency level, the higher the coordination overhead it generates and thus the impact on system performance. Nevertheless, many OLTP applications demand strong consistency and use ROWA(A) for replica synchronization. ROWA(A) protocols eagerly update all (or all available) replicas and thus generate a high overhead for update transactions. In contrast, quorum-based protocols consider only a subset of sites for eager commit. This reduces the overhead for update transactions at the cost of reads, as the latter also need to access several sites. Existing quorum-based protocols do not consider the load of sites when determining the quorums; hence, they are not able to adapt to load changes at run-time. In this paper, we present QuAD, an adaptive quorum-based replication protocol that constructs quorums by dynamically selecting the optimal quorum configuration w.r.t. load and network latency. Our evaluation of QuAD on Amazon EC2 shows that it considerably outperforms both static quorum protocols and dynamic protocols that neglect site properties in the quorum construction process. Index Terms: distributed data management; replication.
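The quorum idea in this abstract can be illustrated with a minimal sketch. It is not QuAD's algorithm, which the abstract does not spell out. The site names, the load scores, and the helper `pick_quorum` are hypothetical. The sketch only checks the standard intersection conditions on quorum sizes and then picks the least-loaded sites.

```python
from typing import Dict, List


def quorum_sizes_valid(n: int, r: int, w: int) -> bool:
    """Standard quorum intersection conditions for n replicas.

    r + w > n : every read quorum overlaps every write quorum.
    2 * w > n : any two write quorums overlap.
    """
    return 1 <= r <= n and 1 <= w <= n and r + w > n and 2 * w > n


def pick_quorum(load: Dict[str, float], size: int) -> List[str]:
    """Hypothetical load-aware choice: take the `size` least-loaded sites."""
    if not 1 <= size <= len(load):
        raise ValueError("quorum size must be between 1 and the number of sites")
    # Sort by load; ties are broken by site name for a deterministic result.
    ranked = sorted(load, key=lambda site: (load[site], site))
    return ranked[:size]


# Assumed example: five sites with made-up load scores (lower is better).
site_load = {"s1": 0.9, "s2": 0.2, "s3": 0.5, "s4": 0.1, "s5": 0.7}
write_quorum = pick_quorum(site_load, 3)
```

Because the overlap guarantee comes from the quorum sizes alone, the member sites can be re-selected whenever the observed load changes without weakening consistency in this simplified picture.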
Most applications deployed in a Cloud require a high degree of availability. For the data layer, this means that data have to be replicated either within a data center or across Cloud data centers. While replication can also increase the performance of read-heavy applications, since the load can be distributed across replica sites, updates need special coordination among the sites and may have an adverse effect on overall performance. The actual effects of data replication depend on the replication protocol used. While ROWAA (read-one-write-all-available) favors read operations, quorum-based replication protocols tend to favor write operations, as not all replica sites need to be updated synchronously. In this paper, we provide a detailed evaluation of ROWAA and quorum-based replication protocols in an Amazon AWS Cloud environment on the basis of the TPC-C benchmark and different transaction mixes. The evaluation results for single data center and multi data center environments show that, in general, the influence of transaction coordination grows significantly with the number of update sites and with the number of update transactions. However, not all quorum-based protocols are well suited for high update loads, as they may create a hot spot that again significantly impacts performance.
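The read/write asymmetry between the two protocol families can be made concrete with a small counting sketch. It is a simplification, not the paper's evaluation method: it assumes all n replicas are available, treats each transaction as a single read or a single write, uses a majority quorum as the representative quorum protocol, and counts sites contacted rather than latency. The function names are hypothetical.

```python
def sites_contacted(protocol: str, n: int) -> dict:
    """Sites accessed synchronously per operation, with all n replicas available."""
    if n < 1:
        raise ValueError("need at least one replica")
    if protocol == "ROWAA":
        # Read one replica, write all available replicas.
        return {"read": 1, "write": n}
    if protocol == "MQ":
        # Majority quorum: reads and writes both contact a majority.
        majority = n // 2 + 1
        return {"read": majority, "write": majority}
    raise ValueError(f"unknown protocol: {protocol}")


def avg_sites_per_txn(protocol: str, n: int, update_fraction: float) -> float:
    """Expected sites contacted for a mix with the given share of updates."""
    if not 0.0 <= update_fraction <= 1.0:
        raise ValueError("update_fraction must lie in [0, 1]")
    c = sites_contacted(protocol, n)
    return (1.0 - update_fraction) * c["read"] + update_fraction * c["write"]
```

Under these assumptions and with five replicas, a read-only mix contacts one site per transaction with ROWAA and three with a majority quorum, while an update-only mix contacts five and three sites respectively.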
Applications deployed in the Cloud usually come with dedicated performance and availability requirements. These can be met by replicating data across several sites and/or by partitioning data. Data replication allows read requests to be parallelized and thus decreases data access latency, but it induces significant overhead for the synchronization of updates. Partitioning, in contrast, is highly beneficial if all the data accessed by an application is located at the same site, but it again necessitates coordination if distributed transactions are needed to serve applications. In this paper, we analyze three protocols for distributed data management in the Cloud, namely Read-One-Write-All-Available (ROWAA), Majority Quorum (MQ) and Data Partitioning (DP), all in a configuration that guarantees strong consistency. We introduce BEOWULF, a meta protocol based on a comprehensive cost model that integrates the three protocols and dynamically selects the protocol with the lowest latency for a given workload. In the evaluation, we compare the predictions of the BEOWULF cost model with a baseline evaluation. The results show the effectiveness of the analytical model and its precision in selecting the best-suited protocol for a given workload.