2016 7th International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS 2016)
DOI: 10.1109/pmbs.2016.010

Evaluating and Optimizing the NERSC Workload on Knights Landing

Abstract: NERSC has partnered with 20 representative application teams to evaluate performance on the Xeon-Phi Knights Landing architecture and develop an application-optimization strategy for the greater NERSC workload on the recently installed Cori system. In this article, we present early case studies and summarized results from a subset of the 20 applications highlighting the impact of important architecture differences between the Xeon-Phi and traditional Xeon processors. We summarize the status of the app…

Cited by 27 publications (30 citation statements)
References: 35 publications
“…Rosales et al [26] study the performance differences observed when using flat, cache and hybrid HBM configurations together with the effect of memory affinity and process pinning in Mantevo suite and NAS parallel benchmark. Barnes et al [27] present the initial performance results of 20 NESAP scientific applications running on KNL nodes of the Cori system and comparing KNL hardware features with traditional Intel Xeon architectures. This study mainly targets at how to effectively run NESAP applications in the Cori system whereas we focus on giving general guidelines on what kind of applications characteristics benefit from running on a hybrid memory system.…”
Section: Related Work (mentioning)
confidence: 99%
“…The NESAP optimization efforts and results are documented in more detail in Barnes et al [16] and Kurth et al [17]. 3 | HETEROGENEOUS PROGRAMMING ENVIRONMENT: The heterogeneous Haswell/KNL system is considerably more complex than a homogeneous Xeon cluster. We describe some challenges encountered and the recommendations we formed around building applications with cross-compilation and binary compatibility.…”
Section: NESAP Results (mentioning)
confidence: 99%
“…The NESAP optimization efforts and results are documented in more detail in Barnes et al and Kurth et al…”
Section: NESAP (mentioning)
confidence: 99%
“…Unlike GPUs and FPGAs, which mandate that code developers use specialized programming models, KNL facilitates application development and porting by providing standard language support (C, C++, Fortran, etc), familiar parallel programming models, and extensive compiler support. However, to obtain maximum performance on the KNL, significant refactoring and optimization of application codes are required to exploit key architectural innovations that KNL features: wide vector units, many-core node design, and deep memory hierarchy. In this paper, the experience and insights gained in porting FEFLO (finite element code for the solution of compressible and incompressible flows) to the KNL platform are presented.…”
Section: Introduction (mentioning)
confidence: 99%
“…However, to obtain maximum performance on the KNL, significant refactoring and optimization of application codes are required to exploit key architectural innovations that KNL features: wide vector units, many-core node design, and deep memory hierarchy. [1][2][3][4][5][6] In this paper, the experience and insights gained in porting FEFLO (finite element code for the solution of compressible and incompressible flows) to the KNL platform are presented. FEFLO is a typical large-scale, production legacy code that has previously been ported and run on vector and GPU hardware.…”
Section: Introduction (mentioning)
confidence: 99%