2019
DOI: 10.14778/3358701.3358706

Distributed edge partitioning for trillion-edge graphs

Abstract: We propose Distributed Neighbor Expansion (Distributed NE), a parallel and distributed graph partitioning method that can scale to trillion-edge graphs while providing high partitioning quality. Distributed NE is based on a new heuristic, called parallel expansion, where each partition is constructed in parallel by greedily expanding its edge set from a single vertex in such a way that the increase of the vertex cuts becomes locally minimal. We theoretically prove that the proposed method has the upper bound in …

citations
Cited by 43 publications
(29 citation statements)
references
References 42 publications
(43 reference statements)
“…NE selects edge gradually to fully fill each partition and this approach performs quite well. M. Hanai et al [14] proposed a follow-up study. They reformed NE to distribute its approach, this proposition can process trillion-edge graphs and achieves better performance in terms of running time.…”
Section: B Edge Partitioning (mentioning)
confidence: 99%
“…The communication cost stems from the vertices/edges spanning computing nodes to ensure the synchronization among all computing nodes. Unfortunately, the graph partition problem with these two constraints is proved to be an NP-hard problem [10], so it is often solved by heuristic methods.…”
Section: Related Work (mentioning)
confidence: 99%
“…SWR [23] resorts the edges in the sliding window to move the edges with low-degree vertices upfront so these edges are less likely to span different computing nodes. Distributed NE [10] selects initial multiple random vertices and then greedily expands each edge set in parallel such that the increase of the vertex cuts becomes minimal, which can allocate most edges in a locally optimal way and seldom uses the random allocation. The locality of real-world graphs also implies many adjacent lists share a lot of common out-neighbors, which is named by target vertices in TSH [24].…”
Section: Related Work (mentioning)
confidence: 99%
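
The expansion idea described in the statement above can be illustrated with a small single-process sketch. This is not the Distributed NE implementation: partitions are grown one after another rather than in parallel, and the function names (`greedy_expansion_partition`, `replication_factor`), the capacity rule, and the tie-breaking are assumptions made for illustration. Vertices are assumed to be integers.

```python
import random
from collections import defaultdict


def greedy_expansion_partition(edges, num_parts, seed=0):
    """Assign each undirected edge to a partition by greedy expansion.

    Simplified illustration only: partitions are filled sequentially,
    whereas Distributed NE expands all partitions in parallel.
    """
    rng = random.Random(seed)
    # incident[v] = indices of still-unassigned edges touching v
    incident = defaultdict(set)
    for idx, (u, v) in enumerate(edges):
        incident[u].add(idx)
        incident[v].add(idx)

    unassigned = set(range(len(edges)))
    capacity = -(-len(edges) // num_parts)  # ceil(|E| / num_parts)
    assignment = [None] * len(edges)

    def other_end(idx, v):
        a, b = edges[idx]
        return a if b == v else b

    for part in range(num_parts):
        covered = set()  # vertices already touched by this partition
        size = 0
        is_last = part == num_parts - 1
        while unassigned and (is_last or size < capacity):
            boundary = sorted(v for v in covered if incident[v])
            if boundary:
                # Expand from the covered vertex whose remaining edges
                # would pull in the fewest new vertices.
                x = min(boundary, key=lambda v: sum(
                    other_end(i, v) not in covered for i in incident[v]))
            else:
                # No boundary left: restart from a random vertex.
                x = rng.choice(sorted(v for v in incident if incident[v]))
            for idx in sorted(incident[x]):
                if not is_last and size >= capacity:
                    break
                u, v = edges[idx]
                assignment[idx] = part
                unassigned.discard(idx)
                incident[u].discard(idx)
                incident[v].discard(idx)
                covered.update((u, v))
                size += 1
    return assignment


def replication_factor(edges, assignment):
    """Average number of partitions in which each vertex appears."""
    parts_of = defaultdict(set)
    for (u, v), part in zip(edges, assignment):
        parts_of[u].add(part)
        parts_of[v].add(part)
    return sum(len(p) for p in parts_of.values()) / len(parts_of)


if __name__ == "__main__":
    # Two triangles joined by a bridge edge (2, 3).
    demo = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 5), (5, 3)]
    parts = greedy_expansion_partition(demo, num_parts=2)
    print(parts, replication_factor(demo, parts))
```

The replication factor is used here as a stand-in for the vertex-cut quality mentioned in the statement: a value of 1.0 means no vertex is split across partitions.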
“…Existing partitioning algorithms can be divided into two categories: In-memory algorithms [30,44,55,66] and streaming algorithms [28,32,47,51,64]. In-memory algorithms load the complete graph into memory, and, hence, have full flexibility to assign any edge to any partition at any time.…”
Section: Introduction (mentioning)
confidence: 99%
“…Streaming algorithms consume little memory, but even though they have been improved by sophisticated techniques such as window-based streaming [47] and multi-pass streaming [48], they do not yield the same partitioning quality on all graphs as the best in-memory algorithms. In current graph partitioning systems, the user has to decide for one of the two options, and then either provide a very large machine (or a cluster of machines) and get good partitioning quality [30,44,55,66] or a small machine and get worse partitioning quality [28,32,47,51,64].…”
Section: Introduction (mentioning)
confidence: 99%
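
The contrast drawn in the two statements above, where a streaming partitioner decides on each edge as it arrives and keeps only small per-vertex state, can be sketched as follows. This is a generic greedy streaming heuristic written for illustration, not any of the algorithms cited in the quoted text; the name `stream_partition`, the `capacity` parameter, and the scoring rule are assumptions.

```python
from collections import defaultdict


def stream_partition(edge_stream, num_parts, capacity):
    """Assign edges one at a time, in arrival order.

    State kept: per-partition load and, per vertex, the set of
    partitions that already hold it. The full graph is never stored.
    """
    replicas = defaultdict(set)  # vertex -> partitions holding a copy
    load = [0] * num_parts
    assignment = []
    for u, v in edge_stream:
        open_parts = [p for p in range(num_parts) if load[p] < capacity]
        if not open_parts:
            raise ValueError("capacity too small for the stream")
        # Prefer partitions that already hold more endpoints of the edge,
        # then the least loaded one, then the lowest partition id.
        part = max(
            open_parts,
            key=lambda p: ((p in replicas[u]) + (p in replicas[v]),
                           -load[p], -p),
        )
        replicas[u].add(part)
        replicas[v].add(part)
        load[part] += 1
        assignment.append(part)
    return assignment


if __name__ == "__main__":
    # A path 0-1-2-3-4-5-6 streamed in order.
    path = [(i, i + 1) for i in range(6)]
    print(stream_partition(path, num_parts=2, capacity=3))
```

Because each decision is final and made without seeing later edges, the outcome depends on the stream order, which is one reason the quoted text reports lower quality for streaming methods than for in-memory ones on some graphs.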