Proceedings of the 1993 ACM/IEEE Conference on Supercomputing (Supercomputing '93), 1993
DOI: 10.1145/169627.169691

Communication and computation performance of the CM-5

Abstract: The Thinking Machines CM-5

Cited by 25 publications (8 citation statements)
References 8 publications
“…The same fast T3D communications are not found on the CM-5 system: transfer times and latency overheads are measured to be ten and 30 times worse than the respective shared memory T3D ones. The presented results are in agreement with [26] for the CM-5 (a worse bandwidth is found in [23], probably due to the old CMMD version), with [25] for the T3D, and with [20,22] for the SP2 (complete results for several MPP systems can be found in [25]). Communication performance, indeed, can be considered the true bottleneck for CM-5 applications (the improvement achieved by low-level message passing libraries on the CM-5 is shown in [23]).…”
Section: Latency and Bandwidth (supporting)
confidence: 86%
“…The maximum bidirectional bandwidths measured in the shift operation on the three MPP systems are reported in Table 2. As in the previous experiment, the T3D within the shared memory model achieves very high transfer rates, outperforming by a factor of three and ten, respectively, the SP2 with MPL (in agreement with [20]) and the CM-5 (in agreement with [26]). We want to stress that the best performance obtained on the T3D is twice the W measured in the ping-pong operation (almost 50 and 110 Mbyte/s of peak are reached with PVMfast and shared memory primitives).…”
Section: Global Shift Operation (supporting)
confidence: 73%
“…This is because as the number of processors increases, the data-parallel operations supported by pipeline bits across the long wires are much less distance sensitive. The experiments in [9] show that message transmission latencies and bandwidths are independent of the partition size on the CM-5. The network latencies vary only slightly with the number of network levels crossed.…”
Section: Network Latency Patterns Versus CM-5 Fat-Tree System Scaling (mentioning)
confidence: 93%
“…Most of the literature deals with the CM-5 and focuses on raw network performance [22,27,29]. Typical communication patterns include simple sends and ping-pong between pairs of nodes. Block permutations of data and grid shifts have been shown to have little or no contention on the CM-5.…”
Section: Introduction (mentioning)
confidence: 99%