Abstract. Given its scalability and semantic expressiveness, the Resource Description Framework (RDF), together with its de facto query language SPARQL, is well suited to managing and querying online social network (OSN) data. Although existing works have introduced distributed frameworks for querying large-scale data, improving online query performance remains a challenging task. To address this problem, this paper proposes a scalable RDF data framework that uses a key-value store for offline RDF storage and a pipelined, in-memory query strategy. The proposed framework efficiently supports SPARQL Basic Graph Pattern (BGP) queries on large-scale datasets. Experiments on a benchmark dataset demonstrate that our framework outperforms existing distributed RDF solutions in online SPARQL query performance.
Keywords: RDF · SPARQL · Social networks · Query processing
Introduction
With the rapid development of social network web applications such as Facebook, Twitter, and Microblog, a large amount of users' linked data is generated. Such data is characterized by large volume and complicated structure, so how to manage OSN data effectively is a hot topic in academic and industrial research. RDF, which is designed for the Semantic Web, offers scalability and flexibility, and SPARQL can express BGP queries over RDF that can be directly applied to OSN subgraph queries. In general, the nature of the RDF model makes it suitable for managing large-scale, complex OSN data. Figure 1 illustrates a fragment of an OSN graph representing relations between users and user-generated content. A query that finds pairs of users connected by a path of friend relationships, where user1 likes blog1 and blog1 was created by user2, is expressed in SPARQL as:
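A minimal sketch of what such a BGP query could look like, and of how it can be evaluated as a join over triple patterns. The predicate names (`friend`, `likes`, `createdBy`), the sample triples, and the simplification of the friend path to a single edge are all assumptions made for illustration. They are not taken from Figure 1, and the matcher below is a naive nested-loop evaluation, not the framework proposed in this paper.

```python
# Hypothetical SPARQL form of the query (predicate names are assumptions):
#   SELECT ?user1 ?user2 WHERE {
#     ?user1 ex:friend    ?user2 .
#     ?user1 ex:likes     ?blog1 .
#     ?blog1 ex:createdBy ?user2 .
#   }

# Toy triples standing in for a fragment of an OSN graph (illustrative data).
TRIPLES = [
    ("alice", "friend", "bob"),
    ("alice", "likes", "post1"),
    ("post1", "createdBy", "bob"),
    ("alice", "friend", "carol"),
    ("carol", "likes", "post1"),
]

# The same basic graph pattern as a list of triple patterns; "?x" marks a variable.
BGP = [
    ("?user1", "friend", "?user2"),
    ("?user1", "likes", "?blog1"),
    ("?blog1", "createdBy", "?user2"),
]


def is_var(term):
    """Return True if the term is a query variable such as '?user1'."""
    return term.startswith("?")


def match_pattern(pattern, triple, binding):
    """Extend `binding` so that `pattern` matches `triple`; return None on conflict."""
    extended = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # A variable must keep the value it was bound to earlier.
            if extended.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            # A constant in the pattern must equal the triple's term.
            return None
    return extended


def evaluate_bgp(patterns, triples):
    """Evaluate patterns left to right, joining each with the bindings so far."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


results = evaluate_bgp(BGP, TRIPLES)
pairs = [(r["?user1"], r["?user2"]) for r in results]
print(pairs)
```

Each triple pattern narrows the set of candidate bindings produced by the previous one, which is the join structure a BGP query engine has to execute, whatever storage and execution strategy it uses.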