Yongming Luo scite author profile

As an essential part of the W3C's semantic web stack and linked data initiative, RDF data management systems (also known as triplestores) have drawn a lot of research attention. The majority of these systems use value-based indexes (e.g., B+-trees) for physical storage, and ignore many of the structural aspects present in RDF graphs. Structural indexes, on the other hand, have been successfully applied in XML and semi-structured data management to exploit structural graph information in query processing. In those settings, a structural index groups nodes in a graph based on some equivalence criterion, for example, indistinguishability with respect to some query workload (usually XPath). Motivated by this body of work, we have started the SAINT-DB project to study and develop a native RDF management system based on structural indexes. In this paper we present a principled framework for designing and using RDF structural indexes for practical fragments of SPARQL, based on recent formal structural characterizations of these fragments. We then explain how structural indexes can be incorporated in a typical query processing workflow, and discuss the design, implementation, and initial empirical evaluation of our approach.
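As a rough illustration of the idea that a structural index groups nodes by an equivalence criterion, the sketch below groups RDF subjects by their set of outgoing predicates and uses the groups to prune candidates for a simple star-shaped lookup. The criterion, the function names (`build_structural_index`, `candidates`), and the sample triples are assumptions made for this example; they are not the equivalence notion or the API of SAINT-DB.

```python
from collections import defaultdict


def build_structural_index(triples):
    """Group subjects by their set of outgoing predicates.

    This is a deliberately simple equivalence criterion chosen for
    illustration: two subjects fall in the same block when they use
    exactly the same predicates.
    """
    predicates_of = defaultdict(set)
    for subject, predicate, _obj in triples:
        predicates_of[subject].add(predicate)

    index = defaultdict(set)
    for subject, predicates in predicates_of.items():
        index[frozenset(predicates)].add(subject)
    return dict(index)


def candidates(index, required_predicates):
    """Return subjects from blocks whose signature covers all required predicates."""
    required = set(required_predicates)
    result = set()
    for signature, subjects in index.items():
        if required <= signature:  # whole block qualifies or is skipped
            result |= subjects
    return result


# Hypothetical sample data.
triples = [
    ("alice", "name", "Alice"),
    ("alice", "knows", "bob"),
    ("bob", "name", "Bob"),
    ("carol", "name", "Carol"),
    ("carol", "knows", "alice"),
    ("paper1", "title", "Indexing"),
]
index = build_structural_index(triples)
knowers = candidates(index, ["name", "knows"])
```

Here the lookup inspects three block signatures rather than every subject, which is the kind of pruning a structural index is meant to enable; a real system would use a much richer equivalence notion.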


Storing and Indexing Massive RDF Datasets

Luo, Picalausa, Fletcher, et al. 2012


In this chapter we present a general survey of the current state of the art in RDF storage and indexing. In the flurry of research on RDF data management in the last decade, we can identify three different perspectives on RDF: (1) a relational perspective; (2) an entity perspective; and (3) a graph-based perspective. Each of these three perspectives has drawn from ideas and results in three distinct research communities to propose solutions for managing RDF data: relational databases (for the relational perspective); information retrieval (for the entity perspective); and graph theory and graph databases (for the graph-based perspective). Our goal in this chapter is to give an up-to-date overview of representative solutions within each perspective.


Bisimulation Reduction of Big Graphs on MapReduce

Luo, Lange, Fletcher, et al. 2013


External memory K-bisimulation reduction of big graphs

Luo, Fletcher, Hidders, et al. 2013


In this paper, we present, to our knowledge, the first I/O-efficient solutions for computing the k-bisimulation partition of a massive directed graph, and performing maintenance of such a partition upon updates to the underlying graph. Ubiquitous in the theory and application of graph data, bisimulation is a robust notion of node equivalence which intuitively groups together nodes in a graph which share fundamental structural features. k-bisimulation is the standard variant of bisimulation where the topological features of nodes are only considered within a local neighborhood of radius k ≥ 0. The I/O cost of our partition construction algorithm is bounded by O(k · sort(|Et|) + k · scan(|Nt|) + sort(|Nt|)), while our maintenance algorithms are bounded by O(k · sort(|Et|) + k · sort(|Nt|)). The space complexity bounds are O(|Nt| + |Et|) and O(k · |Nt| + k · |Et|), resp. Here, |Et| and |Nt| are the number of disk pages occupied by the input graph's edge set and node set, resp., and sort(n) and scan(n) are the cost of sorting and scanning, resp., a file occupying n pages in external memory. Empirical analysis on a variety of massive real-world and synthetic graph datasets shows that our algorithms perform efficiently in practice, scaling gracefully as graphs grow in size.
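To make the notion of a k-bisimulation partition concrete, here is a minimal in-memory sketch that refines a partition for k rounds using node signatures. It assumes a graph with node labels and edge labels and considers outgoing edges only; the function names and the sample graph are hypothetical. It holds everything in main memory, so it is not the I/O-efficient external-memory algorithm described in the abstract.

```python
def k_bisimulation_partition(node_labels, edges, k):
    """Return {node: block_id} after k rounds of signature refinement.

    node_labels: dict mapping node -> label
    edges: iterable of (source, edge_label, target) triples
    """
    outgoing = {}
    for source, edge_label, target in edges:
        outgoing.setdefault(source, []).append((edge_label, target))

    # Round 0: nodes are equivalent exactly when their labels agree.
    ids = {}
    block0 = {n: ids.setdefault(lab, len(ids)) for n, lab in node_labels.items()}

    block = dict(block0)
    for _ in range(k):
        ids = {}
        refined = {}
        for node in node_labels:
            # Signature: own label block plus the set of
            # (edge label, previous-round block of the target) pairs.
            signature = (
                block0[node],
                frozenset((lab, block[t]) for lab, t in outgoing.get(node, ())),
            )
            refined[node] = ids.setdefault(signature, len(ids))
        block = refined
    return block


def blocks(partition):
    """Render a partition as a sorted list of sorted node lists."""
    groups = {}
    for node, block_id in partition.items():
        groups.setdefault(block_id, []).append(node)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical sample graph.
node_labels = {1: "P", 2: "P", 3: "P", 4: "Q", 5: "Q", 6: "R"}
edges = [(1, "knows", 4), (2, "knows", 5), (3, "knows", 4), (4, "likes", 6)]

print(blocks(k_bisimulation_partition(node_labels, edges, 0)))
# [[1, 2, 3], [4, 5], [6]]
print(blocks(k_bisimulation_partition(node_labels, edges, 1)))
# [[1, 2, 3], [4], [5], [6]]
print(blocks(k_bisimulation_partition(node_labels, edges, 2)))
# [[1, 3], [2], [4], [5], [6]]
```

Increasing k widens the neighborhood that is taken into account: nodes 4 and 5 are separated at k = 1 because only node 4 has an outgoing edge, and node 2 is separated from nodes 1 and 3 at k = 2 because its neighbor falls in a different block.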



Yongming Luo

Efficient and scalable trie-based algorithms for computing set containment relations

A Structural Approach to Indexing Triples

Storing and Indexing Massive RDF Datasets

Bisimulation Reduction of Big Graphs on MapReduce

External memory K-bisimulation reduction of big graphs
