Benchmarking Federated SPARQL Query Engines: Are Existing Testbeds Enough?

Montoya, Gabriela; Vidal, María-Esther; Corcho, Óscar; Ruckhaus, Edna; Buil-Aranda, Carlos

doi:10.1007/978-3-642-35173-0_21

“…As a query-mix, we selected the Linked Data (LD), Life Science (LS), and Cross Domain (CD) queries from FedBench, appended with the complex queries (C) by Montoya et al [65]. The complete query-mix was ran 20 times in sequence on the public Web, accessed from a desktop computer in Belgium in order to represent realistic long-distance latency.…”

Section: Methods

mentioning

confidence: 99%

Triple Pattern Fragments: A low-cost knowledge graph interface for the Web

Verborgh, Sande, Hartig et al. 2016

Journal of Web Semantics

Billions of Linked Data triples exist in thousands of RDF knowledge graphs on the Web, but few of those graphs can be queried live from Web applications. Only a limited number of knowledge graphs are available in a queryable interface, and existing interfaces can be expensive to host at high availability. To mitigate this shortage of live queryable Linked Data, we designed a low-cost Triple Pattern Fragments interface for servers, and a client-side algorithm that evaluates SPARQL queries against this interface. This article describes the Linked Data Fragments framework to analyze Web interfaces to Linked Data and uses this framework as a basis to define Triple Pattern Fragments. We describe client-side querying for single knowledge graphs and federations thereof. Our evaluation verifies that this technique reduces server load and increases caching effectiveness, which leads to lower costs to maintain high server availability. These benefits come at the expense of increased bandwidth and slower, but more stable query execution times. These results substantiate the claim that lightweight interfaces can lower the cost for knowledge publishers compared to more expressive endpoints, while enabling applications to query the publishers' data with the necessary reliability.

“…These limitations make it difficult to extrapolate how SPARQL query federation engines will perform when faced with the growing amount of data available on the Data Web based on FedBench results. A fine-grained evaluation of the federation engines to detect the components that need to be improved is also not possible [14].…”

mentioning

confidence: 99%

LargeRDFBench: A Billion Triples Benchmark for SPARQL Endpoint Federation

Saleem, Hasnain, Ngomo 2018

Gathering information from the distributed Web of Data is commonly carried out by using SPARQL query federation approaches. However, the fitness of current SPARQL query federation approaches for real applications is difficult to evaluate with current benchmarks as they are either synthetic, too small in size and complexity or do not provide means for a fine-grained evaluation. We propose LargeRDFBench, a billion-triple benchmark for SPARQL query federation which encompasses real data as well as real queries pertaining to real bio-medical use cases. We evaluate state-of-the-art SPARQL endpoint federation approaches on this benchmark with respect to their query runtime, triple pattern-wise source selection, number of endpoints requests, and result completeness and correctness. Our evaluation results suggest that the performance of current SPARQL query federation systems on simple queries (in terms of total triple patterns, query result set sizes, execution time, use of SPARQL features etc.) does not reflect the systems' performance on more complex queries. Moreover, current federation systems seem unable to deal with real queries that involve processing large intermediate result sets or lead to large result sets.

“…The client can execute the same (regular) SPARQL queries as in the single-server scenario, i.e., queries without the SERVICE keyword. As a query-mix, we selected the Linked Data (LD), Life Science (LS), and Cross Domain (CD) queries from FedBench, appended with the complex queries (C) by Montoya et al [65]. The complete query-mix was ran 20 times in sequence on the public Web, accessed from a desktop computer in Belgium in order to represent realistic long-distance latency.…”

Section: Methods

mentioning

confidence: 99%

Triple Pattern Fragments: A Low-Cost Knowledge Graph Interface for the Web

Verborgh, Sande, Hartig et al. 2016

Cited by 33 publications

References 8 publications
