Abstract. Greedy graph matching provides a fast way to coarsen a graph during graph partitioning. Direct CPU algorithms that perform such greedy matchings are simple and fast, but offer few handholds for parallelisation. To remedy this, we introduce a fine-grained shared-memory parallel algorithm for maximal greedy matching, together with an implementation on the GPU, which is faster (speedups up to 6.8 for random matching and 5.6 for weighted matching) than the serial CPU algorithms and produces matchings of similar (random matching) or better (weighted matching) quality.
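For orientation, the kind of serial greedy matching that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration, not the authors' code and not the parallel GPU algorithm: the function name `greedy_matching`, the adjacency-dictionary input format, and the tie-breaking rule are all assumptions made for the example.

```python
def greedy_matching(adj, order=None):
    """Serial greedy maximal matching (illustrative sketch).

    adj   : dict mapping each vertex to a dict {neighbour: edge weight}
            (hypothetical input format, assumed symmetric).
    order : optional sequence giving the order in which vertices are visited;
            defaults to the iteration order of `adj`.
    Returns a dict mapping each vertex to its partner, or None if unmatched.
    """
    match = {v: None for v in adj}
    for v in (order if order is not None else adj):
        if match[v] is not None:
            continue  # v was already claimed by an earlier vertex
        # Candidate partners: unmatched neighbours, ignoring self-loops.
        candidates = [(w, u) for u, w in adj[v].items()
                      if u != v and match[u] is None]
        if candidates:
            # Weighted variant: take the heaviest available edge
            # (ties are broken by the larger vertex label, an arbitrary choice).
            _, u = max(candidates)
            match[v] = u
            match[u] = v
    return match


# Path graph 0 - 1 - 2 - 3 with a heavy middle edge.
path = {0: {1: 1.0}, 1: {0: 1.0, 2: 5.0}, 2: {1: 5.0, 3: 1.0}, 3: {2: 1.0}}
print(greedy_matching(path))                      # visits 0 first
print(greedy_matching(path, order=[1, 0, 2, 3]))  # visits 1 first
```

Both calls return a maximal matching, but the result depends on the visiting order: starting at vertex 0 pairs (0, 1) and (2, 3), while starting at vertex 1 pairs only (1, 2). Each step reads and writes the shared `match` state left by earlier steps, which is one way to see why such a loop offers few handholds for parallelisation.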
Agglomerative clustering is an effective greedy way to quickly generate graph clusterings of high modularity. In an effort to use the power offered by multi-core CPU and GPU hardware to solve the clustering problem, we introduce a fine-grained shared-memory parallel graph coarsening algorithm and use this to implement a parallel agglomerative clustering heuristic on both the CPU and the GPU. This heuristic is able to generate clusterings in very little time: a clustering with modularity 0.996 is obtained from a street network graph with 14 million vertices and 17 million edges in 4.6 seconds on the GPU.
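The quality measure quoted above, modularity, can be made concrete with a short sketch. The code below assumes the standard definition for an unweighted, undirected graph with m edges, Q = Σ_c [ e_c / m − (d_c / 2m)² ], where e_c is the number of edges inside cluster c and d_c is the total degree of its vertices. The function name `modularity` and the edge-list input format are hypothetical, and the abstract does not say how the authors compute the value.

```python
from collections import defaultdict


def modularity(edges, cluster):
    """Modularity of a clustering of an unweighted, undirected graph.

    edges   : list of (u, v) pairs, each undirected edge listed once.
    cluster : mapping (dict or list) from vertex to cluster label.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)  # e_c: edges with both endpoints in cluster c
    degree = defaultdict(int)    # d_c: sum of vertex degrees in cluster c
    for u, v in edges:
        degree[cluster[u]] += 1
        degree[cluster[v]] += 1
        if cluster[u] == cluster[v]:
            internal[cluster[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
print(modularity(edges, [0, 0, 0, 1, 1, 1]))  # one cluster per triangle: 5/14
print(modularity(edges, [0, 0, 0, 0, 0, 0]))  # everything in one cluster: 0.0
```

Splitting the toy graph along the bridge gives 5/14 ≈ 0.357, whereas lumping all vertices together gives 0, so higher values indicate clusters that are denser internally than expected from the degrees alone.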
A hypergraph model for mapping applications with an all-neighbor communication pattern to distributed-memory computers is proposed, which originated in finite element triangulations. Rather than approximating the communication volume for linear algebra operations, this new model represents the communication volume exactly. To this end, a hypergraph partitioning problem is formulated where the objective function involves a new metric. This metric, the λ(λ − 1)-metric, accurately models the communication volume for an all-neighbor communication pattern occurring in a concrete finite element application. It is a member of a more general class of metrics, which also contains more widely used metrics, such as the cut-net and the (λ − 1)-metric. In addition, we develop a heuristic to minimize the communication volume in the new λ(λ − 1)-metric. For the solution of several real-world finite element problems, experimental results based on this new heuristic demonstrate a small reduction in communication volume compared to a standard graph partitioner and do not show significant reductions in communication volume compared to a hypergraph partitioner using the common (λ − 1)-metric. However, for this set of problems, the new approach does reduce actual communication times. As a by-product, we observe that it also tends to reduce the number of messages. Furthermore, the new approach dramatically reduces the communication volume for a set of sparse matrix problems that are more irregularly structured than finite element problems.
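The three metrics named in this abstract can be compared on a toy hypergraph. The sketch below writes λ for the number of distinct parts that a net (hyperedge) touches and assumes unit net costs: the cut-net metric counts nets with λ > 1, the (λ − 1)-metric sums λ − 1 over all nets, and the new metric sums λ(λ − 1). The function name `partition_metrics` and the input format are hypothetical; the wider class of metrics and the minimization heuristic mentioned in the abstract are not reproduced here.

```python
def partition_metrics(nets, part):
    """Evaluate three hypergraph partitioning metrics with unit net costs.

    nets : list of nets, each an iterable of vertex indices.
    part : mapping (list or dict) from vertex to part number.
    Returns (cut_net, lambda_minus_1, lambda_times_lambda_minus_1).
    """
    cut_net = 0
    lam_minus_1 = 0
    lam_lam_minus_1 = 0
    for net in nets:
        lam = len({part[v] for v in net})  # number of parts this net touches
        if lam == 0:
            continue                        # empty net: contributes nothing
        if lam > 1:
            cut_net += 1
        lam_minus_1 += lam - 1
        # In an all-neighbor pattern, each of the lam parts sends to the
        # other lam - 1 parts, giving lam * (lam - 1) transfers per net.
        lam_lam_minus_1 += lam * (lam - 1)
    return cut_net, lam_minus_1, lam_lam_minus_1


# Five vertices in three parts; the nets touch 1, 2 and 3 parts respectively.
nets = [[0, 1], [1, 2, 3], [0, 2, 4]]
part = [0, 0, 1, 1, 2]
print(partition_metrics(nets, part))  # (2, 3, 8)
```

The example shows how the metrics diverge as nets are spread over more parts: the net touching three parts counts once under cut-net, twice under (λ − 1), and six times under λ(λ − 1).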
We investigate using the Mondriaan matrix partitioner for unweighted graph partitioning in the communication volume and edge-cut metrics. By converting the unweighted graphs to appropriate matrices, we measure Mondriaan's performance as a graph partitioner for the 10th DIMACS challenge on graph partitioning and clustering. We find that Mondriaan can effectively be used as a graph partitioner: with respect to the edge-cut metric, Mondriaan's best results are on average within 13% of the best known results as listed in Chris Walshaw's partitioning archive, but it is an order of magnitude slower than dedicated graph partitioners.