Proceedings of the 2010 Workshop on Massive Data Analytics on the Cloud 2010
DOI: 10.1145/1779599.1779604

Towards scalable RDF graph analytics on MapReduce

Abstract: To exploit the growing amount of RDF data in decision making, there is an increasing demand for analytics-style processing of such data. RDF data is modeled as a labeled graph that represents a collection of binary relations (triples). In this context, analytical queries can be interpreted as consisting of three main constructs, namely pattern matching, grouping, and aggregation, and require several join operations to reassemble the triples into n-ary relations relevant to the given query, unlike traditional OL…

Cited by 25 publications (11 citation statements)
References 12 publications
“…Our previous work, RAPID [13] focused on optimizing analytical processing of RDF data on Pig. RAPID+ [10] extended Pig with UDFs to enable TripleGroup-based processing. In this work, we provide formal semantics to integrate TripleGroups as first-class citizens, and present operators for graph pattern matching.…”
Section: Experiments Results (mentioning)
confidence: 99%
“…In our previous work [10], we proposed an approach to exploit star sub patterns by re-interpreting star-joins using a grouping-based join algorithm. It can be observed that performing a group by Subject yields groups of tuples or TripleGroups that represent all the star sub graphs in the database.…”
Section: Graph Pattern Matching In Apache Pig (mentioning)
confidence: 99%
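
The grouping idea described in the statement above can be illustrated with a minimal Python sketch. It is not the cited authors' implementation, which runs on Pig and MapReduce; the function names `group_by_subject` and `match_star` and the sample triples are hypothetical, chosen only to show how grouping triples by subject yields one group per star subgraph, which can then be filtered against a star pattern.

```python
from collections import defaultdict


def group_by_subject(triples):
    """Group (subject, property, object) triples by subject.

    Each resulting group plays the role of a TripleGroup: all triples
    sharing one subject, i.e. one star subgraph of the RDF graph.
    """
    groups = defaultdict(list)
    for s, p, o in triples:
        groups[s].append((s, p, o))
    return dict(groups)


def match_star(groups, required_properties):
    """Keep only the groups that contain every property of the star pattern."""
    required = set(required_properties)
    return {
        subject: group
        for subject, group in groups.items()
        if required <= {p for _, p, _ in group}
    }


# Hypothetical sample data, for illustration only.
triples = [
    ("product1", "label", "Widget"),
    ("product1", "price", "10"),
    ("product2", "label", "Gadget"),
    ("offer1", "product", "product1"),
]

triple_groups = group_by_subject(triples)
stars = match_star(triple_groups, ["label", "price"])
```

Here a single grouping pass stands in for the separate joins that would otherwise be needed to assemble each subject's `label` and `price` triples into one tuple.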
“…PIGSparQL [19] performs a direct mapping of SPARQL to Pig without focusing on optimization. RAPID+ [17] provides a limited form of on-the-fly optimization where look-ahead processing tries to combine multiple subsequent join steps. The adaptiveness of this approach is however limited compared to our sampling based run-time query optimization.…”
Section: Related Work (mentioning)
confidence: 99%
“…Forecasts [14] suggest that cloud computing would grow at a very fast rate over the next few years. Cloud computing has been used for a variety of tasks such as parallel data mining [15] and graph analytics [16].…”
Section: Cloud Computing For Services (mentioning)
confidence: 99%