R-Kleene: A High-Performance Divide-and-Conquer Algorithm for the All-Pair Shortest Path for Densely Connected Networks

D'Alberto, Paolo; Nicolau, Alexandru

doi:10.1007/s00453-006-1224-z

Cited by 37 publications

(18 citation statements)

References 34 publications

“…In-place parallel recursive approach using Kleene's algorithm [7]. Kleene's algorithm is used for finding the transitive closure, which records whether a path exists between every possible pair of vertices (i, j). Kleene's algorithm divides the nodes of the graph into n⁄√s zones.…”

Section: International Journal of Computer Applications (0975-8887) mentioning

confidence: 99%

Algorithms of All Pair Shortest Path Problem

Susmita

Pandey

2015

IJCA

This paper surveys algorithms for the all-pairs shortest path (APSP) problem on arbitrary real-weighted directed graphs. It summarizes existing methods for solving shortest-path problems, covering both sequential and parallel algorithms. We begin with a review of conventional sequential shortest-path algorithms and then discuss blocked and vectorized implementations, which aim to reduce computational effort.
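The citation statement for this entry describes Kleene's algorithm as computing path existence between every pair of vertices. As a rough illustration only, the Python sketch below shows the classic iterative (Warshall-style) reflexive-transitive closure. It is not the zone-partitioned parallel variant the survey refers to, and the function name `transitive_closure` is a hypothetical choice made here.

```python
def transitive_closure(adj):
    """Return reach[i][j] == True iff vertex j is reachable from vertex i.

    adj is an n x n adjacency matrix (truthy entry = edge i -> j).
    Every vertex is treated as reaching itself (reflexive closure).
    Illustrative sketch, not the implementation from any cited paper.
    """
    n = len(adj)
    reach = [[bool(adj[i][j]) or i == j for j in range(n)] for i in range(n)]
    for k in range(n):              # allow k as an intermediate vertex
        for i in range(n):
            if reach[i][k]:         # i reaches k, so i reaches whatever k reaches
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


if __name__ == "__main__":
    # Hypothetical 3-vertex chain: 0 -> 1 -> 2
    chain = [[0, 1, 0],
             [0, 0, 1],
             [0, 0, 0]]
    for row in transitive_closure(chain):
        print(row)
```

The triple loop does O(n³) work, the same order as Floyd-Warshall, with boolean "or/and" in place of "min/+".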

“…A cache-oblivious algorithm for Floyd-Warshall's APSP algorithm is given in [27] (see also [13]). The algorithm runs in O(n³) time and incurs O(…”

Section: Related Work mentioning

confidence: 99%

The Cache-Oblivious Gaussian Elimination Paradigm: Theoretical Framework, Parallelization and Experimental Evaluation

Chowdhury

Ramachandran

2010

Theory Comput Syst

We consider triply-nested loops of the type that occur in the standard Gaussian elimination algorithm, which we denote by GEP (or the Gaussian Elimination Paradigm). We present two related cache-oblivious methods I-GEP and C-GEP, both of which reduce the number of cache misses incurred (or I/Os performed) by the computation over that performed by standard GEP by a factor of √M, where M is the size of the cache. Cache-oblivious I-GEP computes in-place and solves most of the known applications of GEP including Gaussian elimination and LU-decomposition without pivoting and Floyd-Warshall all-pairs shortest paths. Cache-oblivious C-GEP uses a modest amount of additional space, but is completely general and applies to any code in GEP form. Both I-GEP and C-GEP produce system-independent cache-efficient code, and are potentially applicable to being used by optimizing compilers for loop transformation. We present parallel I-GEP and C-GEP that achieve good speed-up and match the sequential caching performance cache-obliviously for both shared and distributed caches for sufficiently large inputs. We present extensive experimental results for both in-core and out-of-core performance of our algorithms. We consider both sequential and parallel implementations, and compare them with finely-tuned cache-aware BLAS code for matrix multiplication and Gaussian elimination without pivoting. Our results indicate that cache-oblivious GEP offers an attractive trade-off between efficiency and portability. This work was supported in part by NSF Grant CCF-0514876 and NSF CISE Research Infrastructure Grant EIA-0303609. This journal submission incorporates results on the cache-oblivious paradigm that were presented in preliminary form in [8] and [9].
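To make the "triply-nested loop" shape concrete, here is a minimal Python sketch of a plain (not cache-oblivious) GEP-style loop nest, with Floyd-Warshall plugged in as one instance. The names `gep`, `fw_update`, and `update_set`, and the exact form of the update call, are assumptions made for illustration; the sketch does not implement I-GEP or C-GEP.

```python
INF = float("inf")


def gep(c, f, update_set=None):
    """Generic triply-nested update loop in the spirit of GEP.

    c          : n x n matrix (list of lists), updated in place
    f          : update function f(c_ij, c_ik, c_kj, c_kk) -> new c_ij
    update_set : optional set of (i, j, k) triples to update;
                 None means every triple is updated
    Illustrative sketch only; the signature is an assumption.
    """
    n = len(c)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if update_set is None or (i, j, k) in update_set:
                    c[i][j] = f(c[i][j], c[i][k], c[k][j], c[k][k])
    return c


def fw_update(x, u, v, w):
    """Floyd-Warshall relaxation: keep the shorter of x and the path via k."""
    return min(x, u + v)


if __name__ == "__main__":
    # Hypothetical 3-vertex weighted digraph; INF marks a missing edge.
    dist = [[0, 4, 11],
            [6, 0, 2],
            [3, INF, 0]]
    print(gep(dist, fw_update))
```

Swapping `fw_update` for a different update function would give other members of the same loop family, which is why a single reordering technique can cover all of them.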

“…The correctness of the recursive algorithm has been formally proven in various ways before [21,22]. Here we present a simpler proof based on algebraic paths.…”

Section: Recursive In-place APSP Algorithm mentioning

confidence: 99%

“…Recursive formulations of APSP have been presented by many researchers over the years [21,22,23]. The connection to semiring matrix multiplication was shown by Aho et al. [12], but they did not present a complete algorithm.…”

Section: Recursive In-place APSP Algorithm mentioning

confidence: 99%

“…The connection to semiring matrix multiplication was shown by Aho et al. [12], but they did not present a complete algorithm. Ours is a modified version of the algorithm of Tiskin [23] and the R-Kleene algorithm [21]. In particular, the in-place nature of the R-Kleene algorithm helped us avoid expensive global-memory-to-global-memory data copying.…”

Section: Recursive In-place APSP Algorithm mentioning

confidence: 99%

Solving path problems on the GPU

2010

We consider the computation of shortest paths on Graphics Processing Units (GPUs). The blocked recursive elimination strategy we use is applicable to a class of algorithms (such as all-pairs shortest-paths, transitive closure, and LU decomposition without pivoting) having similar data access patterns. Using the all-pairs shortest-paths problem as an example, we uncover potential gains over this class of algorithms. The impressive computational power and memory bandwidth of the GPU make it an attractive platform to run such computationally intensive algorithms. Although improvements over CPU implementations have previously been achieved for those algorithms in terms of raw speed, the utilization of the underlying computational resources was quite low. We implemented a recursively partitioned all-pairs shortest-paths algorithm that harnesses the power of GPUs better than existing implementations. The alternate schedule of path computations allowed us to cast almost all operations into matrix-matrix multiplications on a semiring. Since matrix-matrix multiplication is highly optimized and has a high ratio of computation to communication, our implementation does not suffer from the premature saturation of bandwidth resources as iterative algorithms do. By increasing temporal locality, our implementation runs more than two orders of magnitude faster on an NVIDIA 8800 GPU than on an Opteron. Our work provides evidence that programmers should rethink algorithms instead of directly porting them to GPU.
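The recursive, in-place scheme these citation statements refer to can be sketched in a few lines of NumPy (CPU only). This is a hedged illustration of the general divide-and-conquer idea: close one diagonal block, propagate through the off-diagonal blocks with min-plus products, close the other diagonal block, and propagate back. It is not the code of any cited paper. The names `minplus`, `floyd_warshall`, `r_kleene`, and the `base` cutoff are assumptions, and the graph is assumed to have no negative cycles.

```python
import numpy as np


def minplus(x, y):
    """Min-plus (tropical semiring) product: out[i, j] = min_k x[i, k] + y[k, j]."""
    return np.min(x[:, :, None] + y[None, :, :], axis=1)


def floyd_warshall(d):
    """Iterative Floyd-Warshall, updating the square matrix d in place."""
    for k in range(d.shape[0]):
        np.minimum(d, d[:, k:k + 1] + d[k:k + 1, :], out=d)
    return d


def r_kleene(d, base=2):
    """Recursive in-place APSP sketch on a square distance matrix d.

    d[i, j] is the edge weight i -> j (np.inf if absent, 0 on the diagonal).
    The four blocks are NumPy views, so all updates land in d itself.
    """
    n = d.shape[0]
    if n <= base:
        return floyd_warshall(d)
    h = n // 2
    a, b = d[:h, :h], d[:h, h:]
    c, e = d[h:, :h], d[h:, h:]
    r_kleene(a, base)                         # close paths inside the first half
    b[...] = np.minimum(b, minplus(a, b))     # first half -> second half
    c[...] = np.minimum(c, minplus(c, a))     # second half -> first half
    e[...] = np.minimum(e, minplus(c, b))     # detours through the first half
    r_kleene(e, base)                         # close paths inside the second half
    b[...] = np.minimum(b, minplus(b, e))     # extend using the closed second half
    c[...] = np.minimum(c, minplus(e, c))
    a[...] = np.minimum(a, minplus(b, c))     # detours through the second half
    return d


if __name__ == "__main__":
    rng = np.random.default_rng(0)            # arbitrary seed for a demo graph
    w = rng.integers(1, 20, size=(7, 7)).astype(float)
    w[rng.random((7, 7)) < 0.4] = np.inf      # drop some edges
    np.fill_diagonal(w, 0.0)
    print(np.array_equal(r_kleene(w.copy()), floyd_warshall(w.copy())))
```

Nearly all of the work sits in the six `minplus` calls, which matches the abstract's point that the recursive schedule turns the computation into semiring matrix-matrix multiplications. The dense `minplus` here builds an n×n×n temporary, so it is suitable only for small demonstrations.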
