2016
DOI: 10.14778/2977797.2977806

S2RDF

Abstract: RDF has become very popular for semantic data publishing due to its flexible and universal graph-like data model. Thus, the ever-increasing size of RDF data collections raises the need for scalable distributed approaches. We endorse the usage of existing infrastructures for Big Data processing like Hadoop for this purpose. Yet, SPARQL query performance is a major challenge as Hadoop is not intentionally designed for RDF processing. Existing approaches often favor certain query pattern shapes while performance …

Cited by 112 publications (25 citation statements)
References 24 publications
“…Each SPARQL query is decomposed into multiple subqueries, which are then evaluated independently. Since the data is

[46] | Subject Hash | Distributed Semi-Join
CliqueSquare [25] | Hybrid (Hash + VP) | MapReduce-based Join
DREAM [38] | No partitioning; full replication | RDF-3X [53]
EAGRE [56] | METIS | MapReduce-based Join
gStoreD [45] | Partitioning Agnostic | gStore [37]
H-RDF-3X [29] | METIS | RDF-3X [53]
H2RDF+ [41] | H-Base partitioner (range) | Centralized + MapReduce
HadoopRDF [30] | VP + predicate files on HDFS | MapReduce Join
Partout [36] | Workload-based fragmentation | RDF-3X [53]
PigSparql [14] | Hash + Triple-based files | SPARQL to PigLatin
S2RDF [15] | Extended Vertical Partitioning | SPARQL to SQL
S2X [51] | GraphX partitioning strategy | Vertex-Centric BGP matching
Sedge [57] | Subject Hash | Vertex-Centric BGP matching
Sempala [50] | VP | SPARQL to SQL
SHAPE [32] | Semantic Hash Partitioning | RDF-3X [53]
SHARD [47] | Hash | MapReduce-based Join
TriAD [48] | Hash-based Sharding | Distributed Merge/Hash Joins
TriAD-SG [48] | METIS + Horizontal Sharding | Distributed Merge/Hash Joins
Trinity.RDF [33] | Key-value store on graph | Graph Exploration
WARP [28] | METIS on query workload | RDF-3X [53]

In this survey, we categorize distributed RDF management systems along 2 dimensions based on their execution model: (i) MapReduce and Graph-based systems: such systems rely on general purpose frameworks, i.e., Hadoop or Spark, that offer seamless data distribution and parallelization at the cost of flexibility. (ii) Specialized RDF systems: are built specifically for SPARQL query evaluation by utilizing custom physical layouts, native RDF indexing, efficient communication protocols and explicit replication.…”
Section: Distributed RDF Systems (mentioning)
confidence: 99%
“…S2RDF [15] is a SPARQL engine built on top of Spark [39]. It proposes a relational partitioning technique for RDF data called Extended Vertical partitioning (ExtVP).…”
Section: Sophisticated Partitioning (mentioning)
confidence: 99%
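
The citation statement above only names ExtVP without describing it. As a rough illustration of the general idea, the sketch below assumes that vertical partitioning stores one two-column (subject, object) table per predicate, and that the "extended" variant precomputes semi-join reductions between pairs of those tables. It uses plain Python rather than Spark, and every function name and sample triple is hypothetical, not taken from S2RDF.

```python
from collections import defaultdict


def vertical_partition(triples):
    """Group (s, p, o) triples into one (subject, object) table per predicate."""
    tables = defaultdict(list)
    for s, p, o in triples:
        tables[p].append((s, o))
    return dict(tables)


def semi_join_os(left, right):
    """Keep rows of `left` whose object occurs as a subject in `right`.

    This is an assumed stand-in for one precomputed reduction; rows that
    cannot contribute to an object-subject join are dropped up front.
    """
    right_subjects = {s for s, _ in right}
    return [(s, o) for s, o in left if o in right_subjects]


# Hypothetical sample data.
triples = [
    ("alice", "follows", "bob"),
    ("bob", "follows", "carol"),
    ("carol", "follows", "dave"),
    ("carol", "likes", "item1"),
    ("bob", "likes", "item2"),
]

vp = vertical_partition(triples)

# Reduced "follows" table for queries that join follows.object = likes.subject.
ext_follows_likes = semi_join_os(vp["follows"], vp["likes"])
```

In this sample, the row ("carol", "follows", "dave") is dropped from the reduced table because "dave" never appears as a subject of "likes", so a query joining the two predicates would scan fewer rows.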