Graph database systems are increasingly adopted for storing and processing heterogeneous, network-like datasets. However, due to the novelty of such systems, no standard data model or query language has yet emerged. Consequently, migrating datasets or applications even between related technologies often requires a large amount of manual work or ad hoc solutions, exposing users to the risk of vendor lock-in. To avoid this threat, vendors are working on supporting existing standard languages (e.g. SQL) or on standardising new ones. In this paper, we present a formal specification for openCypher, a high-level declarative graph query language with an ongoing standardisation effort. We introduce relational graph algebra, which extends relational algebra with graph-specific operators, and define a mapping from core openCypher constructs to this algebra. We also propose an algorithm that allows systematic compilation of openCypher queries.
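The idea of mapping core openCypher constructs onto graph-specific relational operators can be illustrated with a minimal sketch. This is not the paper's actual compiler: the operator names (`get_vertices`, `expand_out`, `projection`) and the toy in-memory graph are illustrative assumptions.

```python
# Minimal sketch: evaluating a core openCypher pattern via relational
# graph algebra-style operators. Operator names and the graph encoding
# are hypothetical, chosen only to illustrate the mapping.

graph = {
    "vertices": [
        {"id": 1, "label": "Person", "name": "Alice"},
        {"id": 2, "label": "Person", "name": "Bob"},
    ],
    "edges": [{"src": 1, "dst": 2, "type": "KNOWS"}],
}

def get_vertices(g, label, var):
    # rows binding variable `var` to each vertex with the given label
    return [{var: v} for v in g["vertices"] if v["label"] == label]

def expand_out(g, rows, src_var, edge_type, dst_var):
    # graph-specific operator: follow outgoing edges of a given type
    out = []
    for row in rows:
        for e in g["edges"]:
            if e["src"] == row[src_var]["id"] and e["type"] == edge_type:
                dst = next(v for v in g["vertices"] if v["id"] == e["dst"])
                out.append({**row, dst_var: dst})
    return out

def projection(rows, exprs):
    # relational projection over property accesses
    return [{alias: row[var][prop] for alias, (var, prop) in exprs.items()}
            for row in rows]

# MATCH (p:Person)-[:KNOWS]->(f:Person) RETURN f.name AS friend
result = projection(
    expand_out(graph, get_vertices(graph, "Person", "p"), "p", "KNOWS", "f"),
    {"friend": ("f", "name")},
)
print(result)  # [{'friend': 'Bob'}]
```

The nesting of the calls mirrors the bottom-up structure of the algebra expression a compiler would emit for the `MATCH` clause.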
In this short paper, we provide an early look at the LDBC Social Network Benchmark's Business Intelligence (BI) workload, which tests graph data management systems on graph business analytics queries. Its queries involve complex aggregations and navigations (joins) that touch large data volumes, which is typical in BI workloads, yet they also depend heavily on graph functionality such as connectivity tests and path finding. We outline the motivation for this new benchmark, which we derived from many interactions with the graph database industry and its users, and situate it in a scenario of social network analysis. The workload was designed by taking into account technical "chokepoints" identified by database system architects from academia and industry, which we also describe and map to the queries. We present reference implementations in openCypher, PGQL, SPARQL, and SQL, as well as preliminary results of SNB BI on a number of graph data management systems.
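The graph functionality the BI queries lean on, connectivity tests and path finding, can be sketched with a breadth-first search over an adjacency list. This is an illustrative sketch, not an SNB BI reference implementation; the `knows` graph and function name are assumptions.

```python
# Illustrative BFS for the unweighted shortest-path / connectivity
# primitives typical of graph BI queries (hypothetical example data).
from collections import deque

def shortest_path_length(adj, src, dst):
    # returns the number of hops from src to dst, or -1 if unreachable
    if src == dst:
        return 0
    seen, frontier = {src}, deque([(src, 0)])
    while frontier:
        node, dist = frontier.popleft()
        for nxt in adj.get(node, []):
            if nxt == dst:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return -1

knows = {"alice": ["bob"], "bob": ["carol"], "carol": []}
print(shortest_path_length(knows, "alice", "carol"))  # 2
print(shortest_path_length(knows, "carol", "alice"))  # -1 (unreachable)
```

A connectivity test is then just a check that the returned length is non-negative; in a benchmarked system this primitive runs over data volumes far beyond a single machine's cache, which is exactly what makes it a chokepoint.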
Abstract. Graph database systems are increasingly adopted for storing and processing heterogeneous, network-like datasets. Many challenging applications with near real-time requirements, such as financial fraud detection, on-the-fly model validation and root cause analysis, can be formalised as graph problems and tackled efficiently with graph databases. However, as no standard graph query language has yet emerged, users are exposed to the risk of vendor lock-in. The openCypher group aims to define an open specification for a declarative graph query language. However, creating an openCypher-compatible query engine requires significant research and engineering effort. Meanwhile, model-driven language workbenches support the creation of domain-specific languages by providing high-level tools to create parsers, editors and compilers. In this paper, we present an approach to building a compiler and optimizer for openCypher using model-driven technologies, which allows developers to define declarative optimization rules.
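The flavour of a declarative optimization rule can be sketched as a pattern-plus-rewrite applied to a plan tree until a fixpoint is reached. The node encoding, rule, and names below are hypothetical illustrations, not the paper's actual rule language.

```python
# Sketch of declarative rewrite rules over a query plan tree. A rule is
# a function that either returns a rewritten node or None; the optimizer
# applies rules bottom-up until no rule fires. Names are illustrative.

def merge_selections(node):
    # Selection(p1, Selection(p2, child)) => Selection(p1 AND p2, child)
    if node[0] == "Selection" and node[2][0] == "Selection":
        _, p1, (_, p2, child) = node
        return ("Selection", f"({p1}) AND ({p2})", child)
    return None

RULES = [merge_selections]

def optimize(node):
    # optimize children first, then try each rule at this node
    node = tuple(optimize(c) if isinstance(c, tuple) else c for c in node)
    for rule in RULES:
        rewritten = rule(node)
        if rewritten is not None:
            return optimize(rewritten)
    return node

plan = ("Selection", "n.age > 30",
        ("Selection", "n.label = 'Person'",
         ("GetVertices", "n")))
optimized = optimize(plan)
print(optimized)
# ('Selection', "(n.age > 30) AND (n.label = 'Person')", ('GetVertices', 'n'))
```

In a model-driven workbench, rules like `merge_selections` would be stated declaratively over the plan metamodel rather than written as host-language functions; the fixpoint driver stays the same.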
Verifying ACID compliance is an essential part of database benchmarking, because the integrity of performance results can be undermined: the performance benefits of operating with weaker safety guarantees (at the potential cost of correctness) are well known. Traditionally, benchmarks have specified a number of tests to validate ACID compliance. However, these tests were formulated in the context of relational database systems and SQL, whereas our benchmarking scope covers systems for graph data, many of which are non-relational. This paper presents a set of data-model-agnostic ACID compliance tests for the LDBC (Linked Data Benchmark Council) Social Network Benchmark suite's Interactive (SNB-I) workload, a transaction processing benchmark for graph databases. We test all ACID properties with a particular emphasis on isolation, covering 10 transaction anomalies in total. We present results from implementing the test suite on 5 database systems.

... (LDBC) [3]. In particular, the LDBC's Social Network Benchmark Interactive workload (SNB-I) was designed to target transactional graph databases [10]. To provide protection against violations of correctness arising from the concurrent execution of transactions and system failures, such transactional databases provide Atomicity, Consistency, Isolation, and Durability (ACID) guarantees.

Problem. Verifying ACID compliance is an important step in the benchmarking process for enabling fair comparison between systems. The performance benefits of operating with weaker safety guarantees are well established [13], but this can come at the cost of application correctness. To enable apples-to-apples performance comparisons between systems, they are expected to uphold the ACID properties. Currently, LDBC provides no mechanism for validating ACID compliance within the SNB-I workflow. A simple solution would be to outsource the responsibility of demonstrating ACID compliance to benchmark implementors.
However, the safety properties claimed by a system often do not match its observable behaviour [14]. To mitigate this problem, benchmarks such as TPC-C [20] include a number of ACID tests to be executed as part of the benchmark auditing process. However, we found that these tests cannot readily be applied in our context, as they assume lock-based concurrency control and an interactive query API that gives clients explicit control over a transaction's lifecycle. Modern data systems often use optimistic concurrency control mechanisms [17] and offer a restricted query API, for example executing transactions only as stored procedures [19]. Further, tests that trigger and check row-level locking phenomena, for instance, do not map readily onto graph database systems. Lastly, we found that these tests cover only a limited range of isolation anomalies. Contribution. This paper presents the design of an implementation-agnostic ACID compliance test suite for LDBC SNB-I. Our guiding design principle was to be agnostic of system-level implementation details, relying solely on client observations ...
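The client-observation style of testing can be illustrated with a tiny atomicity check: a transaction writes two records and then aborts, and the client must afterwards observe neither. The `Store` class and its shadow-copy commit are a hypothetical stand-in for a real database, not the paper's test suite.

```python
# Sketch of an ACID test driven purely by client observations: the
# harness never inspects system internals, only the post-state visible
# through the query API. The Store below is an illustrative toy engine.

class Store:
    def __init__(self):
        self.data = {}

    def transaction(self, ops):
        # apply all ops to a shadow copy; commit only if every op succeeds
        shadow = dict(self.data)
        try:
            for op in ops:
                op(shadow)
            self.data = shadow  # atomic commit
            return True
        except Exception:
            return False  # abort: shadow discarded, no partial writes

db = Store()

def write(key, value):
    return lambda s: s.__setitem__(key, value)

def fail(s):
    raise RuntimeError("forced abort")

committed = db.transaction([write("p1", "Alice"), write("p2", "Bob"), fail])

# Atomicity, checked from the client's perspective only:
assert not committed and "p1" not in db.data and "p2" not in db.data
print("atomicity test passed")
```

Isolation anomalies (e.g. lost updates or write skew) are tested in the same spirit: concurrent client transactions are issued and the committed observations are checked against the histories the claimed isolation level permits.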
In this paper, we present a database architecture, together with an application area and method, in detail. As sensor data stream into the database, it is efficient to preprocess them in memory before integrating them into the repository, in order to save storage I/O cost. After the data are integrated, it is important to allow efficient querying based on the data retrieval profile. This can also be supported by the presented database architecture by partitioning the database according to different criteria. It is essential to hide the internal partitioned architecture from higher layers, so we also present options for transparent querying. We have implemented a test system, and experimental results are also given.
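The transparent-partitioning idea can be sketched as follows: data are hash-partitioned by some criterion (here, a sensor id), but the query interface never exposes partition boundaries to higher layers. The class, partitioning key, and API below are hypothetical illustrations, not the paper's implementation.

```python
# Sketch of transparent partitioning (illustrative API): inserts are
# routed by a hash of the partitioning criterion; queries either route
# to one partition or scatter-gather across all of them, so callers
# never name a partition.

class PartitionedDB:
    def __init__(self, n_partitions):
        self.partitions = [dict() for _ in range(n_partitions)]

    def _part(self, sensor_id):
        return hash(sensor_id) % len(self.partitions)

    def insert(self, sensor_id, ts, value):
        part = self.partitions[self._part(sensor_id)]
        part.setdefault(sensor_id, []).append((ts, value))

    def query(self, sensor_id=None):
        if sensor_id is not None:
            # routed: touches exactly one partition
            return self.partitions[self._part(sensor_id)].get(sensor_id, [])
        # scatter-gather: union over all partitions
        return [r for p in self.partitions for rows in p.values() for r in rows]

db = PartitionedDB(4)
db.insert("s1", 1, 20.5)
db.insert("s2", 1, 21.0)
print(db.query("s1"))   # [(1, 20.5)]
print(len(db.query()))  # 2
```

Choosing the partitioning criterion to match the data retrieval profile is what keeps most queries on the cheap, single-partition path.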