2015
DOI: 10.1007/978-3-319-17473-0_5

NAS Parallel Benchmarks for GPGPUs Using a Directive-Based Programming Model

Citations: cited by 20 publications (38 citation statements)
References: 11 publications
“…We could easily recognize various suboptimal performance trends. For the OpenACC port of the NPB [19], we observe that LU and FT do not significantly improve with the increase in SM count, while MG improves and then drop in performance at high SM count. The performance does not improve significantly with the increase of problem size from Class A to B or C. For the CUDA variant of NPB [9], LU has a significant improvement with SM count.…”
Section: Motivation (mentioning)
confidence: 87%
“…Xu et al [13] focused on directive-based parallelization of NPB benchmarks. After analyzing and profiling the OpenMP version of NPB, they annotate the source code with OpenACC directives to automatically generate GPU versions of the benchmarks.…”
Section: Related Work (mentioning)
confidence: 99%
“…The NPB has been implemented for diverse parallel programming models over the years, which include OpenMP [22], MPI [23], [24], and Multi-Zone [25]. In particular, parallel versions for accelerators include implementations in OpenCL [26] and OpenACC [27]. The CUDA version for the complete NPB kernels and applications has not been reported yet.…”
Section: NAS Parallel Benchmarks (NPB) (mentioning)
confidence: 99%