Cache Augmented Database Management Systems, CADBMSs, enhance the velocity of simple operations that read and write a small amount of data from big data. They are most suitable for applications with workloads that exhibit a high read to write ratio, e.g., interactive social networking actions. This study surveys the state of the art in CADBMSs and presents physical data independence as the next step in their evolution. We detail the requirements of this evolution, technological trends and software practices, and our research efforts in this area.
Cache augmented SQL, CASQL, systems enhance the performance of simple operations that read and write a small amount of data from big data. They do so by looking up the results of computations that query the database in a key-value store (KVS) instead of processing them using a relational database management system (RDBMS). These systems incur undesirable race conditions that cause the KVS to produce stale data. This paper presents the IQ framework that provides strong consistency with no modification to the RDBMS. It consists of two non-blocking leases, Inhibit (I) and Quarantine (Q). Ratings obtained from a social networking benchmark named BG show the proposed framework has minimal impact on system performance while providing strong consistency guarantees.

A Introduction

Organizations extend a relational database management system (RDBMS) with a key-value store (KVS) to enhance system performance for workloads consisting of simple operations that exhibit a high read to write ratio, e.g., interactive social networking actions. The key insight is that query result look up using the KVS is faster and more efficient than processing the same query using the RDBMS. A challenge of the resulting Cache Augmented SQL (CASQL) system is how to keep the cached query results consistent in the presence of updates to the RDBMS.

One approach is for the developer to provide software to either invalidate, refresh, or incrementally update the key-value pairs, see Figure 1. To describe these techniques, we define a session as a sequence consisting of an RDBMS operation followed by one or more KVS operations (or with the KVS operations being performed first). With invalidate, the application computes the keys impacted by the SQL Data Manipulation Language (DML) commands and deletes them from the KVS. With refresh, the application reads the impacted key-value pairs, updates them, and …
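The invalidate technique described in this excerpt can be illustrated with a minimal sketch. It is not the IQ framework and does nothing about the race conditions the paper targets. An in-memory SQLite database stands in for the RDBMS and a plain dict stands in for the KVS; the table, the key format, and the function names (`read_profile`, `update_name`) are all hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite plays the RDBMS, a dict plays the KVS.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("INSERT INTO users VALUES (1, 'alice')")
db.commit()

kvs = {}
stats = {"hits": 0, "misses": 0}


def profile_key(user_id):
    """Hypothetical key format for a cached query result."""
    return f"profile:{user_id}"


def read_profile(user_id):
    """Look the result up in the KVS; on a miss, query the RDBMS and cache it."""
    key = profile_key(user_id)
    if key in kvs:
        stats["hits"] += 1
        return kvs[key]
    stats["misses"] += 1
    row = db.execute(
        "SELECT name FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    if row is None:
        return None
    kvs[key] = row[0]
    return row[0]


def update_name(user_id, new_name):
    """Write session: RDBMS update first, then delete the impacted key."""
    db.execute("UPDATE users SET name = ? WHERE id = ?", (new_name, user_id))
    db.commit()
    kvs.pop(profile_key(user_id), None)  # invalidate
```

With a single thread this sequence never serves stale data. With concurrent sessions, a reader can repopulate the KVS with an old value between the update and the delete, which is the kind of race the excerpt refers to.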
This paper compares the performance of an SQL solution that implements a relational data model with a document store named MongoDB. We report on the performance of a single node configuration of each data store and assume the database is small enough to fit in main memory. We analyze utilization of the CPU cores and the network bandwidth to compare the two data stores. Our key findings are as follows. First, for those social networking actions that read and write a small amount of data, the join operator of the SQL solution is not slower than the JSON representation of MongoDB. Second, with a mix of actions, the SQL solution provides either the same performance as MongoDB or outperforms it by 20%. Third, a middle-tier cache enhances the performance of both data stores as query result look up is significantly faster than query processing with either system. (A shorter version of this paper appeared in the ACM International Conference on Information and Knowledge Management (CIKM), San Francisco, CA, Oct 2013.)

A Introduction

There is an abundance of data stores with both the computer industry and the research arena contributing novel architectures and data models. In [10], Cattell surveys and classifies 22 data stores to motivate a quantitative analysis of the alternative designs and implementations. We study a specific aspect of this vast multi-faceted topic, namely, a comparison of an industrial strength relational database management system (RDBMS) named SQL-X (due to a licensing agreement, we cannot disclose the identity of this system) and a NoSQL document store named MongoDB. While SQL-X implements a relational data model [12], MongoDB implements a JSON representation of data [14]. Each offers a rich set of design choices. We use the BG [5] benchmark to exercise the different capabilities of each data store.
This social networking benchmark consists of a database and eleven actions (see Table 1) that either read or write a small amount of data from the database.

While SQL-X does not scale horizontally, MongoDB scales to a large number of nodes. In addition to impacting the performance of a single node instance of each data store, physical organization of data impacts the horizontal scalability of MongoDB. While both are important, we focus on the performance of a single node instance of each data store for the following reasons. First, it provides insights into the tradeoffs associated with two alternative logical data designs, namely, relational and JSON. An interesting finding is that the use of the join operator is not slower than the JSON representation, see Section D.

Second, while BG's interactive social networking actions are simple, they interact in complex ways to offer a wide range of design choices. We show it is beneficial to move the work of read actions to write actions when the workload is dominated by read actions. (According to Facebook, more than 99% of their workload is dominated by queries [3,28].) Materialized views are not appropriate because they provide either a very low perfor...
Cost Adaptive Multi-queue eviction Policy (CAMP) is an algorithm for a general purpose key-value store (KVS) that manages key-value pairs computed by applications with different access patterns, key-value sizes, and varying costs for each key-value pair. CAMP is an approximation of the Greedy Dual Size (GDS) algorithm in that its eviction policy is as effective as GDS. At the same time, its implementation is as efficient as LRU. Similar to an implementation of LRU using queues, it adapts to changing workload patterns based on the history of requests for different key-value pairs. It is superior to LRU because it considers both the size and cost of key-value pairs to maximize the utility of the available memory across competing applications. We compare CAMP with both LRU and an alternative that requires human intervention to partition memory into pools and assign groupings of key-value pairs to different pools. The results demonstrate CAMP is as fast as LRU while outperforming both LRU and the pooled alternative. We also present results from an implementation of CAMP using Twitter's version of memcached.
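To make the size-and-cost idea concrete, here is a minimal sketch of the textbook Greedy Dual Size policy that the abstract says CAMP approximates. It is not CAMP itself: the multi-queue structure that gives CAMP its LRU-like efficiency is not shown, and victim selection here is a linear scan for clarity. The class name, method names, and the `inflation` attribute are hypothetical; sizes are assumed to be positive.

```python
class GreedyDualSizeCache:
    """Sketch of Greedy Dual Size: evict the entry with the lowest priority H."""

    def __init__(self, capacity):
        self.capacity = capacity  # total size budget
        self.used = 0
        self.inflation = 0.0      # the running value L; rises on each eviction
        self.entries = {}         # key -> [priority, size, cost, value]

    def get(self, key):
        entry = self.entries.get(key)
        if entry is None:
            return None
        # On a hit, restore the priority to H = L + cost / size.
        entry[0] = self.inflation + entry[2] / entry[1]
        return entry[3]

    def put(self, key, value, size, cost):
        if size > self.capacity:
            return False  # cannot fit even in an empty cache
        if key in self.entries:
            self.used -= self.entries.pop(key)[1]
        while self.used + size > self.capacity:
            # Linear scan for the minimum priority; a real system would
            # use a heap or, as in CAMP, a set of queues.
            victim = min(self.entries, key=lambda k: self.entries[k][0])
            priority, victim_size, _, _ = self.entries.pop(victim)
            self.inflation = priority  # L becomes the evicted entry's H
            self.used -= victim_size
        self.entries[key] = [self.inflation + cost / size, size, cost, value]
        self.used += size
        return True
```

For example, with a capacity of 10, inserting a costly pair (size 5, cost 10) and then a cheap pair (size 5, cost 1) leaves the cheap pair as the eviction victim when a third pair arrives, whereas LRU would evict the older, costly pair.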