We describe two hypercube algorithms to find the biconnected components (i.e., blocks) of a connected undirected graph. One is a modified version of the Tarjan-Vishkin algorithm. Both hypercube algorithms were experimentally evaluated on an NCUBE/7 MIMD hypercube computer. The two algorithms have comparable performance, and efficiencies as high as 0.7 were observed.
Keywords and phrases

Hypercube computing, MIMD computer, parallel programming, biconnected components

* This research was supported in part by the National Science Foundation under grants DCR84-20935 and MIP 86-17374.
INTRODUCTION

In this paper we develop two biconnected component (i.e., block) algorithms suitable for medium-grained MIMD hypercubes. The first algorithm is an adaptation of the algorithm of Tarjan and Vishkin [TARJ85]. Tarjan and Vishkin provide parallel CRCW and CREW PRAM implementations of their algorithm. The CRCW PRAM implementation runs in $O(\log n)$ time and uses $O(n + m)$ processors. Here $n$ and $m$ are, respectively, the number of vertices and edges in the input connected graph. The CREW PRAM implementation runs in $O(\log^2 n)$ time using $O(n^2/\log^2 n)$ processors.

A PRAM algorithm that uses $p$ processors and $t$ time can be simulated by a $p$-processor hypercube in $O(t \log^2 p)$ time using the random access read and write algorithms of Nassimi and Sahni [NASS81]. The CRCW PRAM algorithm of [TARJ85] therefore results in an $O(\log^3 n)$ time, $O(n + m)$ processor hypercube algorithm. The CREW PRAM algorithm results in an $O(\log^4 n)$ time, $O(n^2/\log^2 n)$ processor hypercube algorithm. Using the results of Dekel, Nassimi, and Sahni [DEKE81], the biconnected components can be found in $O(\log^2 n)$ time using $O(n^3/\log n)$ processors.

The processor-time product of a parallel algorithm is a measure of the total amount of work done by the algorithm. For the three hypercube algorithms just mentioned, the processor-time product is, respectively, $O(n^2 \log^3 n)$ (assuming $m \approx O(n^2)$), $O(n^2 \log^2 n)$, and $O(n^3 \log n)$. In each case the processor-time product is larger than that for the single processor biconnected components algorithm ($O(n^2)$ when $m = O(n^2)$). As a result, we do not expect any of the above three hypercube algorithms to outperform the single processor algorithm unless the number of available processors, $p$, is sufficiently large. For example, if $n = 1024$, then the CRCW simulation on a hypercube does $O(\log^3 n) \approx 1000$ times more work than the uniprocessor algorithm.
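As a rough check of this comparison, the processor-time products can be evaluated numerically. The sketch below is illustrative only and is not part of the algorithms developed in this paper. It assumes base-2 logarithms, drops all constant factors hidden by the $O(\cdot)$ notation, and takes $m = n^2$. The function names `processor_time_products` and `work_ratios` are hypothetical.

```python
import math


def processor_time_products(n):
    """Leading-order processor-time products with constants dropped.

    Assumptions (not from the paper): logarithms are base 2 and m = n^2,
    so O(n + m) processors is taken as n^2.
    """
    lg = math.log2(n)
    # name: (number of processors, running time)
    algorithms = {
        "CRCW simulation [TARJ85]": (n**2, lg**3),
        "CREW simulation [TARJ85]": (n**2 / lg**2, lg**4),
        "[DEKE81]": (n**3 / lg, lg**2),
    }
    return {name: p * t for name, (p, t) in algorithms.items()}


def work_ratios(n):
    """Each product divided by the n^2 work of the sequential algorithm."""
    sequential_work = n**2
    return {
        name: work / sequential_work
        for name, work in processor_time_products(n).items()
    }


if __name__ == "__main__":
    for name, ratio in work_ratios(1024).items():
        print(f"{name}: about {ratio:.0f} times the sequential work")
```

Under these assumptions, each ratio is also a lower bound on the number of processors at which the corresponding hypercube algorithm could match the single processor running time.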
So we will need approximately 1000 processors just to break even.

In fact, the processor-time product for many of the asymptotically fastest parallel hypercube algorithms exceeds that of the fastest uniprocessor algorithm by at least a multiplicative factor of $\log^k n$ for some $k \ge 1$. As a result, the simulation of these algorithms on commercially available hypercubes with a limited number of processors does not yield good results. Consequently, there is often a wide disparity between the asymptotic algorithms developed for PRAM...