2012 International Conference for High Performance Computing, Networking, Storage and Analysis (SC12)
DOI: 10.1109/sc.2012.69

Hybridizing S3D into an Exascale application using OpenACC: An approach for moving to multi-petaflops and beyond

Abstract: Hybridization is the process of converting an application with a single level of parallelism to an application with multiple levels of parallelism. Over the past 15 years, the majority of applications that run on High Performance Computing systems have employed MPI for all of the parallelism within the application. In the Peta-Exascale computing regime, effective utilization of the hardware requires multiple levels of parallelism, matched to the macro architecture of the system, to achieve good performance. A h…
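
To make the hybridization pattern concrete, here is a minimal sketch in C (S3D itself is Fortran): MPI supplies the coarse, inter-node level of parallelism, and a single OpenACC directive exposes a second, fine-grained level within each rank. The stencil, array names, and sizes are illustrative, not taken from S3D.

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define NLOCAL 1000000   /* grid points owned by this rank (illustrative) */

int main(int argc, char **argv)
{
    int rank, nranks;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nranks);

    double *u   = malloc(NLOCAL * sizeof(double));
    double *rhs = malloc(NLOCAL * sizeof(double));
    for (int i = 0; i < NLOCAL; i++) { u[i] = 1.0; rhs[i] = 0.0; }

    /* Level 1: MPI has already decomposed the global domain across ranks.
       Level 2: the directive parallelizes this rank's work on an
       accelerator (or across cores when built for a multicore target). */
    #pragma acc parallel loop copyin(u[0:NLOCAL]) copy(rhs[0:NLOCAL])
    for (int i = 1; i < NLOCAL - 1; i++)
        rhs[i] = 0.5 * (u[i-1] - 2.0 * u[i] + u[i+1]);

    /* The coarse level still communicates through MPI as before. */
    double local = rhs[NLOCAL / 2], global;
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum of midpoint values over %d ranks: %f\n", nranks, global);

    free(u); free(rhs);
    MPI_Finalize();
    return 0;
}
```

The structural point is that the existing MPI decomposition is left intact; the second level of parallelism is added with directives rather than by rewriting the loop nest.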

Cited by 36 publications (26 citation statements). References 5 publications.

“…Thanks to the support from numerous vendors, OpenACC and OpenMP have rapidly established themselves as the de facto solutions for directive-based code development. Although capable of delivering acceptable performance [44,19] in a broad range of applications, neither OpenACC nor OpenMP targets an entire cluster. Users are thus left on their own to write code that deals with MPI.…”
Section: Compiler Directives
Citation type: mentioning (confidence: 99%)
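
A hedged sketch of what being left to write the MPI layer yourself typically looks like in such codes: the directives keep the array resident on the device, but the cluster-level halo exchange must be staged through host buffers by hand. The 1-D decomposition and all names here are hypothetical, not from the cited papers.

```c
#include <mpi.h>

/* One Jacobi-style sweep with a manual halo exchange.  The data region
   keeps u resident on the device; the inter-node exchange is hand-written
   MPI, staged through host memory with update directives.  left/right
   should be MPI_PROC_NULL at the ends of the domain. */
void sweep(double *u, double *unew, int n, int left, int right, int nsteps)
{
    #pragma acc data copy(u[0:n]) create(unew[0:n])
    for (int step = 0; step < nsteps; step++) {
        /* Copy boundary cells device -> host so MPI can see them. */
        #pragma acc update host(u[1:1], u[n-2:1])

        MPI_Sendrecv(&u[1],   1, MPI_DOUBLE, left,  0,
                     &u[n-1], 1, MPI_DOUBLE, right, 0,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Sendrecv(&u[n-2], 1, MPI_DOUBLE, right, 1,
                     &u[0],   1, MPI_DOUBLE, left,  1,
                     MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        /* Push the received ghost cells host -> device. */
        #pragma acc update device(u[0:1], u[n-1:1])

        #pragma acc parallel loop present(u[0:n], unew[0:n])
        for (int i = 1; i < n - 1; i++)
            unew[i] = 0.5 * (u[i-1] + u[i+1]);

        #pragma acc parallel loop present(u[0:n], unew[0:n])
        for (int i = 1; i < n - 1; i++)
            u[i] = unew[i];
    }
}
```
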
“…Hart et al. [8] used Co-Array Fortran (CAF) + OpenACC and Levesque et al. [10] used MPI + OpenACC to utilize multiple GPUs in a GPU cluster. The inter-GPU communication in these cases is managed using an approach similar to the distributed shared memory model.…”
Section: Related Work
Citation type: mentioning (confidence: 99%)
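
One common way an MPI + OpenACC code drives multiple GPUs is to bind each rank on a node to its own device through the OpenACC runtime API. The sketch below shows that pattern under stated assumptions; neither cited paper necessarily selects devices this way, and acc_device_nvidia is the device type as named by the PGI/NVHPC implementation.

```c
#include <mpi.h>
#include <openacc.h>

/* Bind each MPI rank on a node to its own GPU (illustrative pattern). */
void bind_rank_to_gpu(MPI_Comm comm)
{
    int rank, local_rank;
    MPI_Comm node_comm;

    MPI_Comm_rank(comm, &rank);
    /* Ranks that share a node get consecutive local ranks (MPI-3). */
    MPI_Comm_split_type(comm, MPI_COMM_TYPE_SHARED, rank,
                        MPI_INFO_NULL, &node_comm);
    MPI_Comm_rank(node_comm, &local_rank);

    int ngpus = acc_get_num_devices(acc_device_nvidia);
    if (ngpus > 0)
        acc_set_device_num(local_rank % ngpus, acc_device_nvidia);

    MPI_Comm_free(&node_comm);
}
```

Each rank then drives exactly one GPU, and inter-GPU traffic flows as ordinary MPI messages between ranks, which is what gives the exchange its distributed-shared-memory flavor.
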
“…Since these are a small subset of the total fields in a Cell, there is a significant improvement in performance by only moving the needed field data. We note that a version of S3D using OpenACC [19] can perform a similar operation, under the condition that data is laid out in system memory using a struct-of-arrays format so individual field data is dense, allowing OpenACC to copy individual dimensions of the array. Entangling layout with data movement optimizations in the OpenACC code results in code that is difficult to modify when exploring different mapping strategies and tuning for new architectures.…”
Section: Motivation
Citation type: mentioning (confidence: 99%)
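
The layout condition is easy to see in code: with an array-of-structs, a single field is strided through memory and cannot be named as one dense subarray in a data clause, while a struct-of-arrays makes each field contiguous, so OpenACC can move exactly the field a kernel needs. The types and field names below are illustrative, not S3D's.

```c
/* Array-of-structs: temperature is strided through memory, so it cannot
   be copied to the device as a single dense subarray. */
typedef struct { double temp, pres, vel[3]; } CellAoS;

/* Struct-of-arrays: each field is its own dense array. */
typedef struct {
    double *temp;   /* n entries */
    double *pres;   /* n entries */
} FieldsSoA;

/* A diffusion-style update that touches only the temperature field.
   Only temp crosses the host-device link; pres never leaves host memory. */
void heat_step(const double *temp, double *temp_new, int n)
{
    #pragma acc parallel loop copyin(temp[0:n]) copyout(temp_new[0:n])
    for (int i = 0; i < n; i++) {
        if (i == 0 || i == n - 1)
            temp_new[i] = temp[i];               /* boundary: carry over */
        else
            temp_new[i] = temp[i]
                        + 0.1 * (temp[i-1] - 2.0 * temp[i] + temp[i+1]);
    }
}
```

A caller would pass the dense field directly, e.g. heat_step(f.temp, scratch, n), leaving every other field on the host.
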
“…We compare against two versions of S3D: a CPU-only version and an improved hybrid version from [19] that uses OpenACC.…”
Section: S3D
Citation type: mentioning (confidence: 99%)