2016
DOI: 10.14778/3021924.3021937

Scalable distributed subgraph enumeration

Abstract: Subgraph enumeration aims to find all the subgraphs of a large data graph that are isomorphic to a given pattern graph. As the subgraph isomorphism operation is computationally intensive, researchers have recently focused on solving this problem in distributed environments, such as MapReduce and Pregel. Among them, the state-of-the-art algorithm, TwinTwigJoin, is proven to be instance optimal based on a left-deep join framework. However, it is still not scalable to large graphs because of the constraints in t…
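
As a rough illustration of the problem the abstract defines, the sketch below enumerates matches of a small pattern graph in a small undirected data graph by brute force. It is not the paper's algorithm and not TwinTwigJoin: it runs on a single machine, tries every injective vertex assignment, and assumes the common non-induced notion of a match (every pattern edge must map to a data edge). The function names `enumerate_embeddings` and `distinct_subgraphs` are hypothetical.

```python
from itertools import permutations


def enumerate_embeddings(pattern_edges, data_edges):
    """Return every injective vertex mapping that sends each pattern edge
    to an edge of the data graph (brute force, undirected graphs)."""
    # Store both directions so edge lookups ignore orientation.
    data_adj = set()
    for u, v in data_edges:
        data_adj.add((u, v))
        data_adj.add((v, u))

    pattern_vertices = sorted({x for edge in pattern_edges for x in edge})
    data_vertices = sorted({x for edge in data_edges for x in edge})

    embeddings = []
    # Try every ordered choice of distinct data vertices as the image.
    for image in permutations(data_vertices, len(pattern_vertices)):
        mapping = dict(zip(pattern_vertices, image))
        if all((mapping[a], mapping[b]) in data_adj for a, b in pattern_edges):
            embeddings.append(mapping)
    return embeddings


def distinct_subgraphs(pattern_edges, embeddings):
    """Collapse embeddings that cover the same set of data edges."""
    return {
        frozenset(frozenset((m[a], m[b])) for a, b in pattern_edges)
        for m in embeddings
    }


# Toy input (illustrative only): a triangle pattern and a data graph
# made of one triangle {1, 2, 3} plus a pendant edge 3-4.
triangle = [(0, 1), (1, 2), (0, 2)]
data = [(1, 2), (2, 3), (1, 3), (3, 4)]

found = enumerate_embeddings(triangle, data)
unique = distinct_subgraphs(triangle, found)
```

On this toy input the single data triangle is found once per symmetry of the pattern, which is why the embeddings are deduplicated before counting subgraphs. The cost grows factorially with the pattern size, which hints at why the abstract calls the operation computationally intensive.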

Cited by 86 publications (98 citation statements)
References 24 publications

“…If E is still true after the local verification, we add (u, v) to f (Line 13). Then we create a new trie node N for v with N as its parentN (Line 14,15). After that, if f grows to an EC of Pi , then for each undetermined edge e of f (both end vertices are not in the local machine), we add N to I[e] (Line 17, 18).…”
Section: Algorithm 1: Expandembedtrie (mentioning)
confidence: 99%
“…Since join-based methods need to group the intermediate results based on keys so as to join them together, the performance was significantly dragged down when dealing with sparse graphs compared with RADS and PSgL. It is worth noting that PSgL was verified slower than TwinTwig and SEED in [13] [15]. This may be because the datasets used in TwinTwig and SEED are much denser than RoadNet, hence a huge number of embeddings will be generated.…”
Section: Exp-1: roadnet (mentioning)
confidence: 99%
“…In doing so, we also achieve two things. First, we compare our work to the recent SEED [36] work, which develops efficient optimizations for evaluating undirected subgraph queries in the distributed setting. Second, by implementing one of the optimizations, we demonstrate that our approach can take as input general relations instead of the binary edge(ai, aj) relations we used so far.…”
Section: Generality and Specializations (mentioning)
confidence: 99%
“…Below we refer to some interesting representative examples. Methods, such as TwinTwig [22], sTwig [6] and SEED [23] deal with a single, very large graph, stored in a distributed infrastructure, and rely on parallel computing algorithms and infrastructures to perform the sub-iso testing. Methods, like iGQ [24] and GraphCache [25], employ caching on top of any proposed FTV method to improve performance and study the architecture, system and algorithms for a graph cache for subgraph queries for FTV and SI methods.…”
Section: Background and Related Work (mentioning)
confidence: 99%