Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis 2011
DOI: 10.1145/2063384.2063440

An early performance analysis of POWER7-IH HPC systems

Abstract: In this work we present a performance evaluation of the POWER7-IH processor and of integrated systems built from it. We describe the architecture of P7-IH with an emphasis on those characteristics that have a direct impact on the performance for large-scale HPC systems and applications. An important area of emphasis is the memory and communication subsystems and their impact on achievable application performance. The results from a set of micro-benchmarks are presented that include memory, communication and OS…

Cited by 10 publications (11 citation statements)
References 18 publications

“…The Hub Chip has seven different links for connecting nodes on the same drawer. These links have unidirectional bandwidth of 3 GB/s point-to-point between cores with a maximum of 24 GB/s aggregated unidirectional bandwidth [40]. In the streaming benchmark only one of the threads in the node communicates with a neighbouring node and therefore only one of the Hub links is used, decreasing the maximum bandwidth that is available.…”
Section: Microbenchmark Performance (mentioning)
confidence: 99%
“…We used a custom designed software stack that more tightly integrates communication with the application logic; such stacks need to be further explored. 8 When evaluating all-to-all communications, we noticed that naively communicating between all pairs of processes creates hotspots. Others have argued for support for randomized routing in hardware for reducing hotspots [18].…”
Section: Large-scale Observations (mentioning)
confidence: 99%
“…To the best of our knowledge, this approach to alleviating hotspots is novel. 8 We have also used such an integrated stack in the design of our Graph500 implementation [1], a non-trivial workload.…”
Section: Large-scale Observations (mentioning)
confidence: 99%
“…Our approach takes inspiration from prior work on high-level performance analysis and modeling [3, 26-28, 44], as well as the classical theory of circuit models and the area-time trade-offs studied in models based on very large-scale integration (VLSI) [37,49]. Our analysis is in many ways most similar to several recent theoretical exascale modeling studies [22,47], combined with trends analysis [34].…”
Section: Introduction (mentioning)
confidence: 99%