Keywords: graph queries, relational algebra, query optimization

Introduction

The key components of Big Data are often defined as variety, velocity and volume [28] of data. Applications operating on continuously changing graphs are a prime example: the semi-structured, graph-like nature of the data introduces high variety, changes happen at high velocity, and datasets are often high-volume. Such applications include fraud detection in financial transactions [27], validation of engineering models [3], and static analysis of source code repositories [35]. These use cases provide a set of complex queries that need to be evaluated continuously on each change of the underlying graph.

Traditional approaches need to reevaluate each query upon each change, which often takes minutes on a large dataset. In contrast, incremental query evaluation caches interim results, so it only requires reevaluation on the small fragment of the dataset impacted by the change. This leads to a significant speedup for large and continuously changing data. Although several approaches exist for incremental query evaluation [9,20] in the context of expert systems, incremental query evaluation is not in widespread use in graph databases.

In order to predict query performance at runtime, relational databases synthesize and evaluate different query plans, each of which imposes a certain ordering on the relational algebraic operations prescribed by the query. Optimizing query plans is a challenging task, since even simple queries may admit a wide variety of query plans with different costs. Database engines use heuristics-based optimization techniques and evaluate a cost function for the different query plans [10].

Query plans have been adapted for graph query engines that use a local-search based query evaluation strategy, where the query plan is called the search plan.
Optimization techniques may exploit the type and multiplicity information defined in the graph schema (or metamodel) [29,22] or rely upon runtime statistics of the instance graph [11,38,39].

For incremental graph query engines, the structure and the content of caches have the most significant impact on query performance. Therefore, optimization aims to reduce the execution time and memory consumption imposed by a complex network of caches [37].
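The contrast between full reevaluation and incremental evaluation can be illustrated with a minimal sketch. It assumes a single, fixed query (all two-hop paths a -> b -> c) and in-memory adjacency caches. The names `IncrementalTwoHop` and `evaluate_from_scratch` are hypothetical and chosen only for illustration; this is not the cache network of any particular engine discussed here.

```python
from collections import defaultdict


def evaluate_from_scratch(edges):
    """Non-incremental baseline: recompute every two-hop path a -> b -> c."""
    out = defaultdict(set)
    for src, dst in edges:
        out[src].add(dst)
    return {(a, b, c) for a, b in edges for c in out[b]}


class IncrementalTwoHop:
    """Keeps the two-hop result set up to date under edge insertions and deletions."""

    def __init__(self):
        self.out_edges = defaultdict(set)  # cache: source -> targets
        self.in_edges = defaultdict(set)   # cache: target -> sources
        self.matches = set()               # cached query result

    def _delta(self, src, dst):
        # Only matches that use the edge src -> dst can be affected by a change to it:
        # either as the first hop (src, dst, c) or as the second hop (a, src, dst).
        delta = {(src, dst, c) for c in self.out_edges[dst]}
        delta |= {(a, src, dst) for a in self.in_edges[src]}
        return delta

    def insert_edge(self, src, dst):
        if dst in self.out_edges[src]:
            return  # edge already present, nothing changes
        self.out_edges[src].add(dst)
        self.in_edges[dst].add(src)
        self.matches |= self._delta(src, dst)

    def delete_edge(self, src, dst):
        if dst not in self.out_edges[src]:
            return  # edge not present, nothing changes
        # Compute the affected matches while the edge is still in the caches.
        delta = self._delta(src, dst)
        self.out_edges[src].discard(dst)
        self.in_edges[dst].discard(src)
        self.matches -= delta


engine = IncrementalTwoHop()
edges = [(1, 2), (2, 3), (2, 4), (3, 1)]
for s, d in edges:
    engine.insert_edge(s, d)
# The incrementally maintained result agrees with a full reevaluation.
assert engine.matches == evaluate_from_scratch(edges)

engine.delete_edge(2, 3)
assert engine.matches == evaluate_from_scratch([(1, 2), (2, 4), (3, 1)])
```

Each update touches only the neighbours of the changed edge, whereas the baseline scans the whole edge list. The price is the memory held by the two adjacency caches and the cached result set, which is the trade-off that cache-oriented optimization targets.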