2017
DOI: 10.2139/ssrn.3199272

Decomposing Federated Queries in Presence of Replicated Fragments

Abstract: Federated query engines allow for linked data consumption using SPARQL endpoints. Replicating data fragments from different sources enables data re-organization and provides the basis for more effective and efficient federated query processing. However, existing federated query engines are not designed to support replication. In this paper, we propose a replication-aware framework named LILAC, sparqL query decomposItion against federations of repLicAted data sourCes, that relies on replicated fragment descript…

citations
Cited by 7 publications
(16 citation statements)
references
References 30 publications
“…Recent examples of such research include the generation of a navigable Graph of Things from live Internet of Things data sources [50] and the use of crowdsourcing to provide real-time transport data in rural areas [51], both topics with parallels to how RIs gather and expose field observations acquired via sensors or human experts. On the topic of distributed query, various languages/frameworks have been proposed such as LDQL [52] and LILAC [53], which can make linked data based search over distributed catalogues more practical than is currently the case by better distributing queries across catalogue nodes with less redundancy and then joining the results efficiently. Such developments reduce the need to aggregate as much metadata in a joint catalogue, however the demands of search (particularly with regard to perceived responsiveness to queries by end-users) make it still generally necessary to cache key metadata in a central store.…”
Section: Linking With Semantic Web (mentioning)
confidence: 99%
“…In order to query the remote RDF stores a global index is required that indicates which RDF stores contain data that are relevant for the query. This index is created by retrieving statistical information that are provided by the remote RDF stores (e.g., SPLENDID [48], WoDQA [8], LHD [134], DAW [120], SemaGrow [26], FEDRA [91], LILAC [92] and Odyssey [90]) or the user (e.g., DARQ [115]), by sending special queries to the remote RDF stores (e.g., FedX [127], ANAPSID [7,93], Lusail [86]) or by observing the results that are returned during the processing of user queries (e.g., ADERIS [83]). Also combinations of these strategies are possible as in Avalanche [18].…”
Section: Federated RDF Stores (mentioning)
confidence: 99%
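The statistics-based strategy in the statement above can be illustrated with a minimal sketch. It assumes each remote store reports per-predicate triple counts; the endpoint URLs, the `STORE_STATS` layout, and both function names are hypothetical and do not reproduce the index format of any engine named in the quote.

```python
from collections import defaultdict

# Hypothetical statistics, as each remote store might report them:
# predicate -> number of triples using that predicate.
STORE_STATS = {
    "http://store-a.example/sparql": {"foaf:name": 1200, "foaf:knows": 300},
    "http://store-b.example/sparql": {"foaf:name": 50, "dbo:birthPlace": 900},
}


def build_global_index(store_stats):
    """Map each predicate to the set of stores that report triples for it."""
    index = defaultdict(set)
    for store, stats in store_stats.items():
        for predicate, count in stats.items():
            if count > 0:
                index[predicate].add(store)
    return dict(index)


def relevant_stores(index, predicate, all_stores):
    """Return the stores to contact for a triple pattern.

    A variable predicate (None) cannot be pruned by this index,
    so every store stays relevant.
    """
    if predicate is None:
        return set(all_stores)
    return set(index.get(predicate, set()))


index = build_global_index(STORE_STATS)
print(relevant_stores(index, "dbo:birthPlace", STORE_STATS))
```

A predicate-only index of this kind is the simplest case; it prunes nothing when the predicate is a variable, which is one reason richer statistics are collected in practice.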
“…If a triple pattern with two constants is requested, the indices described so far could only restrict the number of queried compute nodes by either of the two constants. To restrict the number of queried compute nodes even further, LILAC [92], SemStore [138] additionally count how frequently all subject-property, property-object and subject-object combinations occur.…”
Section: Centralized Indices (mentioning)
confidence: 99%
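The pair-counting idea in the statement above can be sketched as follows. This is an illustration under assumptions, not the data structure used by LILAC or SemStore: the node names, the sample triples, and the functions `build_pair_counts` and `nodes_for_pattern` are all hypothetical.

```python
from collections import Counter


def build_pair_counts(triples):
    """Count subject-property, property-object and subject-object pairs."""
    sp, po, so = Counter(), Counter(), Counter()
    for s, p, o in triples:
        sp[(s, p)] += 1
        po[(p, o)] += 1
        so[(s, o)] += 1
    return {"sp": sp, "po": po, "so": so}


def nodes_for_pattern(node_counts, s=None, p=None, o=None):
    """Return the nodes that can match a pattern with exactly two constants.

    None stands for a variable position in the triple pattern.
    """
    if s is not None and p is not None and o is None:
        kind, key = "sp", (s, p)
    elif p is not None and o is not None and s is None:
        kind, key = "po", (p, o)
    elif s is not None and o is not None and p is None:
        kind, key = "so", (s, o)
    else:
        raise ValueError("expected exactly two constants")
    # A Counter returns 0 for unseen keys, so missing pairs prune the node.
    return {node for node, counts in node_counts.items() if counts[kind][key] > 0}


# Hypothetical data held by two compute nodes.
NODE_TRIPLES = {
    "node1": [("ex:alice", "foaf:knows", "ex:bob"),
              ("ex:alice", "foaf:name", '"Alice"')],
    "node2": [("ex:carol", "foaf:knows", "ex:bob"),
              ("ex:alice", "foaf:knows", "ex:dave")],
}
NODE_COUNTS = {node: build_pair_counts(t) for node, t in NODE_TRIPLES.items()}

print(nodes_for_pattern(NODE_COUNTS, s="ex:alice", o="ex:bob"))
```

In this sample, an index keyed on either constant alone would keep both nodes for the pattern `ex:alice ?p ex:bob`, since each node mentions `ex:alice` and `ex:bob`; the subject-object count narrows the result to the single node that holds both in the same triple.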
“…Ulysses uses a replication-aware source selection algorithm to identify which TPF servers can be used to distribute evaluation of triple patterns during SPARQL query processing, based on the replication model introduced in [2,3].…”
Section: Replication-aware Source Selection (mentioning)
confidence: 99%
“…Consider the SPARQL query Q1 in Figure 1, and the two servers S1 and S2 publishing a replica of the DBpedia 2015 dataset, hosted by DBpedia and LANL Linked Data Archive, respectively. Executing Q1 with the regular TPF client [4] on S1 alone generates 442 HTTP calls, takes 7s in average, and returns 222 results.…”
Section: Introduction (mentioning)
confidence: 99%