Comparison of scalable parallel matrix multiplication libraries

Huss–Lederman, Steven; Jacobson, Elaine; Tsao, Anna

doi:10.1109/splc.1993.365573

Cited by 17 publications

(16 citation statements)

References 9 publications

“…However, it is competitive, or faster, and, given its simplicity and flexibility, warrants consideration. Moreover, the implementations by Huss-Lederman et al [7,8] are competitive with PUMMA, and would thus compare similarly with SUMMA. Also, our method is presented in a slightly simplified setting and thus the performance may be slightly better than it would be if we implemented exactly for the cases for which PUMMA and the algorithm by Huss-Lederman et al were designed.…”

Section: Results (mentioning)

confidence: 99%

“…Two recent efforts extend the work by Fox et al to general meshes of nodes: the paper by Choi et al [6] uses a two-dimensional block-wrapped (block-cyclic) data decomposition, while the papers by Huss-Lederman et al [7,8] use a 'virtual' 2-D torus wrap data layout. Both these efforts report very good performance attained on the Intel Touchstone Delta, achieving a sizeable percentage of peak performance.…”

Section: Introduction (mentioning)

confidence: 99%

SUMMA: scalable universal matrix multiplication algorithm

Geijn

Watts

1997

Concurrency: Pract. Exper.

367

255

“…We discuss a permutation compatible data distribution (i.e. virtual 2D torus wrap data distribution [21,7]), which is used to distribute matrices on two-dimensional process grid topologies. Finally, we introduce a modified virtual 2D data distribution that can solve the potential load imbalance problem induced by the virtual 2D torus wrap data distribution.…”

Section: Permutation Compatible Data Distributions (mentioning)

confidence: 99%

“…For a non-square grid G_{P×Q}, we can view it as a α × α virtual grid [21,7], where α is the least common multiple of P and Q. Then we distribute matrices on this α × α virtual grid [7].…”

Section: The Virtual 2-dimensional Grid (mentioning)

confidence: 99%
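
The virtual-grid construction quoted above can be illustrated with a minimal Python sketch. Only the rule α = lcm(P, Q) comes from the citation statement. The function names and the modular wrap mapping from virtual nodes to physical processes are assumptions made for illustration, not the layout defined in the cited papers.

```python
from collections import Counter
from math import gcd


def virtual_grid_side(p: int, q: int) -> int:
    """Return alpha = lcm(P, Q), the side of the square virtual grid."""
    if p <= 0 or q <= 0:
        raise ValueError("grid dimensions must be positive")
    return p * q // gcd(p, q)


def physical_owner(i: int, j: int, p: int, q: int) -> tuple[int, int]:
    """Map virtual node (i, j) to a physical process.

    ASSUMPTION: a plain modular wrap, (i mod P, j mod Q). The cited
    papers may define the torus wrap differently.
    """
    return (i % p, j % q)


def virtual_nodes_per_process(p: int, q: int) -> Counter:
    """Count how many virtual nodes each physical process hosts."""
    alpha = virtual_grid_side(p, q)
    return Counter(
        physical_owner(i, j, p, q)
        for i in range(alpha)
        for j in range(alpha)
    )
```

Under this assumed mapping, a 2 × 3 physical grid yields a 6 × 6 virtual grid, and each of the 6 processes hosts (6/2) · (6/3) = 6 virtual nodes.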