Proceedings 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing (IPPS/SPDP 1999)
DOI: 10.1109/ipps.1999.760431
Parallel matrix multiplication on a linear array with a reconfigurable pipelined bus system

Abstract: The known fast sequential algorithms for multiplying two N × N matrices (over an arbitrary ring) have time complexity O(N^α), where 2 < α < 3. The current best value of α is less than 2.3755. We show that for all 1 ≤ p ≤ N^α, multiplying two N × N matrices can be performed on a p-processor linear array with a reconfigurable pipelined bus system (LARPBS) in O(N^α/p + (N^2/p^(2/α)) log p) time. This is currently the fastest parallelization of the best known sequential matrix multiplication algorithm on a distributed memory parallel system.
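The abstract's running-time bound can be explored numerically. The helper below is a hypothetical illustration (not code from the paper): it evaluates T(p) = N^α/p + (N^2/p^(2/α))·log p for a given processor count p, ignoring the constant factors hidden by the O-notation.

```python
import math

def larpbs_time_bound(N, p, alpha=2.3755):
    """Evaluate the LARPBS time bound from the abstract, up to constants.

    T(p) = N^alpha / p  +  (N^2 / p^(2/alpha)) * log p

    `alpha` defaults to the best known matrix-multiplication exponent
    cited in the abstract (< 2.3755). The log term is clamped at p = 2
    so the bound stays meaningful for p = 1 (an assumption of this
    sketch, not a claim from the paper).
    """
    matmul_term = N ** alpha / p
    comm_term = (N ** 2 / p ** (2.0 / alpha)) * math.log2(max(p, 2))
    return matmul_term + comm_term

# More processors shrink both terms of the bound:
# the arithmetic term by a factor of p, the second by p^(2/alpha)/log p.
serial = larpbs_time_bound(1024, 1)
parallel = larpbs_time_bound(1024, 64)
```

Increasing p reduces the arithmetic term linearly while the second term shrinks sublinearly, which is why the speedup remains nontrivial over the whole range 1 ≤ p ≤ N^α stated in the abstract.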

Cited by 13 publications (7 citation statements)
References 39 publications
“…In particular, a systematic design of a cylindrical array for matrix multiplication has been presented. Keqin Li et al. presented a brief treatment of parallel matrix multiplication on a reconfigurable pipelined bus system [9]. They claimed that two N × N matrices can be multiplied in O(N^α/p + (N^2/p^(2/α)) log p) time.…”
Section: Recent Techniques for Matrix Multiplication
confidence: 99%
“…A similar thread of work, although in a different context, deals with reconfigurable architectures, either pipelined bus systems [12], or FPGAs [18]. In the latter approach, tradeoffs must be found to optimize the size of the on-chip memory and the available memory bandwidth, leading to partitioned algorithms that re-use data intensively.…”
Section: Related Work
confidence: 99%
“…It is now feasible to build distributed memory systems that are no less powerful and flexible than shared memory systems in solving many problems, such as Boolean matrix multiplication [17] and sorting [24]. Numerous parallel algorithms using optical interconnection networks have been developed recently [1,14,17,18,20,22,23,24,28,29,31,36,38].…”
Section: Distributed Memory Parallel Computers
confidence: 99%