Proceedings of the 7th International Workshop on Programming Models and Applications for Multicores and Manycores 2016
DOI: 10.1145/2883404.2883420

An Evaluation of Emerging Many-Core Parallel Programming Models

Abstract: In this work we directly evaluate several emerging parallel programming models: Kokkos, RAJA, OpenACC, and OpenMP 4.0, against the mature CUDA and OpenCL APIs. Each model has been used to port TeaLeaf, a miniature proxy application, or miniapp, that solves the heat conduction equation, and belongs to the Mantevo suite of applications. We find that the best performance is achieved with device-tuned implementations but that, in many cases, the performance portable models are able to solve the same problems to wi…

Cited by 38 publications (31 citation statements)
References 18 publications
“…To the authors knowledge, the only study that has compared the same simple benchmark in all the programming models of interest across a wide range of devices is one they themselves performed, where the TeaLeaf heat diffusion miniapp from the Mantevo benchmark suite was used in a similar manner to measure performance portability [9,6].…”
Section: Related Work (mentioning)
confidence: 99%
“…Lin et al [7] used the ROSE source-to-source compiler to port a number of stencil applications, investigating performance and productivity. In our previous work, we compared the performance of a number of parallel programming models, including OpenMP 4.0, Kokkos, and RAJA [8]. We later discussed the performance of OpenMP 4.0 ports of the TeaLeaf, CloverLeaf, and BUDE mini-apps on NVIDIA GPUs [9].…”
Section: Concluding Suggestions For Performance Portability (mentioning)
confidence: 99%
“…Faced with the plethora of parallel programming models currently available, we expect many developers will see OpenMP 4.x as a familiar and attractive option that can balance performance, portability, productivity and maintainability [8]. Of course, there are no guarantees of performance portability offered by the specification and the divergence of existing implementations means that it is currently possible to write code that is non-portable between different implementations even targeting the same architecture.…”
Section: Introduction (mentioning)
confidence: 99%
“…Martineau et al [5], [18], [19] discuss several variants of TeaLeaf that have been parallelised using a number of programming models. Further, they compare different solvers within TeaLeaf: Conjugate Gradient (CG), Chebyshev and Chebyshev polynomially preconditioned CG (PPCG), on three different Intel Xeon processors, an IBM Power8 processor, an NVIDIA Tesla K20x GPU and an Intel Knights Corner accelerator card [5], [18], [19]. Recently, TeaLeaf was reengineered to use the OPS [6] embedded domain specific language, and the Kokkos [7] and RAJA [8] C++ template libraries.…”
Section: Introduction (mentioning)
confidence: 99%