Van-Quyet NGUYEN †a) , Nonmember and Kyungbaek KIM †b) , Member
SUMMARYA widely-used query on a graph is a regular path query (RPQ) whose answer is a set of tuples of nodes connected by paths corresponding to a given regular expression. Traditionally, evaluating an RPQ on a large graph takes substantial memory spaces and long response time. Recently, several studies have focused on improving response time for evaluating an RPQ by splitting an original RPQ into smaller subqueries, evaluating them in parallel and combining partial answers. In these works, how to choose split labels in an RPQ is one of key points of the performance of RPQ evaluation, and rare labels of a graph can be used as split labels. However there is still a room for improvement, because a rare label cannot guarantee the minimum evaluation cost all the time. In this paper, we propose a novel approach of selecting split labels by estimating evaluation cost of each split subquery with a unit-subquery cost matrix (USCM), which can be obtained from a graph in prior to evaluate an RPQ. USCM presents the evaluation cost of a unit-subquery which is the smallest possible subquery, and we can estimate the evaluation cost of an RPQ by decomposing into a set of unit-subqueries. Experimental results show that our proposed approach outperforms rare label based approaches.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.