Parallel Algorithms for Butterfly Computations

Shi, Jessica; Shun, Julian

doi:10.48550/arxiv.1907.08607

Cited by 4 publications

(29 citation statements)

References 0 publications


“…Several real-world systems naturally exhibit bipartite relationships, such as consumer-product purchase network of an e-commerce website [26], user-ratings data in a recommendation system [23,34], author-paper network of a scientific field [41], group memberships in a social network [37] etc. Due to the rapid growth of data produced in these domains, efficient mining of dense structures in bipartite graphs has become a popular research topic [30,51,54,66,67,72].…”

Section: Introduction (mentioning)

confidence: 99%

“…Butterfly (2,2-biclique/quadrangle) is the smallest cohesive motif in bipartite graphs. Butterflies can be used to directly analyze bipartite graphs and have drawn significant research interest in the recent years [24,48,49,51,54,64,65,66]. Sariyuce and Pinar [51] use butterflies as a density indicator to define the notion of 𝑘-wings and 𝑘-tips, as maximal bipartite subgraphs where each edge and vertex, respectively, is involved in at least 𝑘 butterflies.…”

Section: Introduction (mentioning)

confidence: 99%
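
To make the butterfly definition quoted above concrete, here is a minimal sequential sketch of exact butterfly counting by wedge aggregation. It is an illustration only, not the parallel algorithm of the cited paper; the function name `count_butterflies` and the edge-list input format are assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def count_butterflies(edges):
    """Count butterflies (2,2-bicliques) in a bipartite graph.

    `edges` is an iterable of (u, v) pairs, with u on the left side and
    v on the right side (hypothetical input format for this sketch).
    Left-side vertex ids must be mutually comparable.
    """
    # Left-side neighbors of every right-side vertex.
    right_adj = defaultdict(set)
    for u, v in edges:
        right_adj[v].add(u)

    # common[(u1, u2)] = number of right-side vertices adjacent to both.
    common = defaultdict(int)
    for nbrs in right_adj.values():
        for u1, u2 in combinations(sorted(nbrs), 2):
            common[(u1, u2)] += 1

    # Any two common neighbors of a left pair close one butterfly.
    return sum(c * (c - 1) // 2 for c in common.values())
```

A pair of left vertices with c common neighbors contributes c(c-1)/2 butterflies, so the complete bipartite graph with two vertices per side contains exactly one.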

“…Existing algorithms for decomposing bipartite graphs typically employ an iterative bottom-up peeling approach [51,54], wherein entities (edges and vertices for wing and tip decomposition, respectively) with the minimum support (butterfly count) are peeled in each iteration. Peeling an entity 𝑙 involves deleting it from the graph and updating the support of other entities that share butterflies with 𝑙.…”

Section: Introduction (mentioning)

confidence: 99%
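
The bottom-up peeling described in the statement above can be sketched for tip decomposition as follows. This is a simple sequential illustration under stated assumptions, not the implementation of any cited framework; the name `tip_numbers` and the edge-list input are hypothetical. Only left-side vertices are peeled, so the number of butterflies shared by two left vertices never changes and can be precomputed.

```python
from collections import defaultdict
from itertools import combinations


def tip_numbers(edges):
    """Bottom-up tip decomposition of the left side of a bipartite graph.

    `edges` is an iterable of (u, v) pairs with u on the left side
    (hypothetical input format). Returns {left vertex: tip number}.
    """
    right_adj = defaultdict(set)
    left = set()
    for u, v in edges:
        right_adj[v].add(u)
        left.add(u)

    # Number of common right-side neighbors for every left pair.
    common = defaultdict(int)
    for nbrs in right_adj.values():
        for u1, u2 in combinations(sorted(nbrs), 2):
            common[(u1, u2)] += 1

    # shared[u][w] = butterflies that contain both u and w.
    shared = defaultdict(dict)
    for (u1, u2), c in common.items():
        b = c * (c - 1) // 2
        if b:
            shared[u1][u2] = b
            shared[u2][u1] = b

    # Support of a vertex = number of butterflies it participates in.
    support = {u: sum(shared[u].values()) for u in left}

    remaining = set(left)
    tip = {}
    k = 0
    while remaining:
        # Peel a vertex with minimum support (ties broken by id).
        u = min(remaining, key=lambda x: (support[x], x))
        k = max(k, support[u])  # tip numbers never decrease
        tip[u] = k
        remaining.remove(u)
        # Update the support of vertices that shared butterflies with u.
        for w, b in shared[u].items():
            if w in remaining:
                support[w] -= b
    return tip
```

Each loop iteration depends on the support updates of all earlier iterations, which is the sequential dependency the statement refers to.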

“…Parallel computing is widely used to scale such high complexity analytics to large datasets [33,42,57]. However, the bottom-up peeling approach used in existing parallel frameworks [54] severely restricts parallelism by peeling entities in a strictly increasing order of their entity numbers (wing or tip numbers). Consequently, it takes a very large number of iterations to peel an entire graph, for example, it takes > 31 million iterations to peel all edges of the trackers dataset using bottom-up peeling.…”

Section: Introduction (mentioning)

confidence: 99%

“…Moreover, each peeling iteration is sequentially dependent on support updates in all prior iterations, thus mandating synchronization of parallel threads before each iteration. Hence, the conventional approach of parallelizing workload within each iteration [54] suffers from heavy thread synchronization, and poor parallel scalability.…”

Section: Introduction (mentioning)

confidence: 99%


Parallel Peeling of Bipartite Networks for Hierarchical Dense Subgraph Discovery

Lakhotia, Kannan, Prasanna

2021

Preprint


Motif-based graph decomposition is widely used to mine hierarchical dense structures in graphs. In bipartite graphs, wing and tip decomposition construct a hierarchy of butterfly (2,2-biclique) dense edge and vertex induced subgraphs, respectively. They have applications in several domains including e-commerce, recommendation systems and document analysis.

Existing decomposition algorithms use a bottom-up approach that constructs the hierarchy in an increasing order of subgraph density. They iteratively select the entities (edges or vertices) with minimum support (butterfly count) and peel them, i.e., remove them from the graph and update the support of other entities. The number of butterflies in real-world bipartite graphs makes bottom-up peeling computationally demanding. Furthermore, the strict order of peeling entities results in a large number of iterations with sequential dependencies on preceding support updates. Consequently, parallel algorithms based on bottom-up peeling can only utilize intra-iteration parallelism and require heavy synchronization, leading to poor scalability.

In this paper, we propose a novel Parallel Bipartite Network peelinG (PBNG) framework which adopts a two-phased peeling approach to relax the order of peeling, and in turn, dramatically reduce synchronization. The first phase divides the decomposition hierarchy into few partitions, and requires little synchronization to compute such partitioning. The second phase concurrently processes all of these partitions to generate individual levels in the final decomposition hierarchy, and requires no global synchronization.

Effectively, both phases of PBNG parallelize computation across multiple levels of decomposition hierarchy, which is not possible with bottom-up peeling. The two-phased peeling further enables batching optimizations that dramatically improve the computational efficiency of PBNG. The proposed approach represents a non-trivial generalization of our prior work on a two-phased vertex peeling algorithm [30], and its adoption for both tip and wing decomposition.

We empirically evaluate PBNG using several real-world bipartite graphs and demonstrate radical improvements over the existing approaches. On a shared-memory 36 core server, PBNG achieves up to 19.7× self-relative parallel speedup. Compared to the state-of-the-art parallel framework ParButterfly, PBNG reduces synchronization by up to 15260× and execution time by up to 295×. Furthermore, it achieves up to 38.5× speedup over state-of-the-art algorithms specifically tuned for wing decomposition. We also present the first decomposition results of some of the largest public real-world datasets, which PBNG can peel in a few minutes or hours, but algorithms in current practice fail to process even in several days. Our source code is made available at https://github.com/kartiklakhotia/RECEIPT.


Graphlet counting in massive networks

Sanei-Mehri

FLEET: Butterfly Estimation from a Bipartite Graph Stream

Sanei-Mehri, Zhang, Sariyuce et al.

2018

Preprint


We consider space-efficient single-pass estimation of the number of butterflies, a fundamental bipartite graph motif, from a massive bipartite graph stream where each edge represents a connection between entities in two different partitions. We present a space lower bound for any streaming algorithm that can estimate the number of butterflies accurately, as well as FLEET, a suite of algorithms for accurately estimating the number of butterflies in the graph stream. Estimates returned by the algorithms come with provable guarantees on the approximation error, and experiments show good tradeoffs between the space used and the accuracy of approximation. We also present space-efficient algorithms for estimating the number of butterflies within a sliding window of the most recent elements in the stream. While there is a significant body of work on counting subgraphs such as triangles in a unipartite graph stream, our work seems to be one of the few to tackle the case of bipartite graph streams.
