2021
DOI: 10.1002/spe.3056

NAS Parallel Benchmarks with CUDA and beyond

Abstract: NAS Parallel Benchmarks (NPB) is a standard benchmark suite used in the evaluation of parallel hardware and software. Several research efforts from academia have made these benchmarks available with different parallel programming models beyond the original versions with OpenMP and MPI. This work joins these research efforts by providing a new CUDA implementation for NPB. Our contribution covers different aspects beyond the implementation. First, we define design principles based on the best programming practic…

Cited by 18 publications (8 citation statements)
References 12 publications
“…There is an ongoing effort to create SkePU implementations, and subsequently evaluations, of many benchmark workloads across several benchmark suites. Such suites include Rodinia [10], PARSEC [9] and its parallel derivate P3ARSEC [15], Poly-Bench [34], and NAS Parallel Benchmarks [5,8,27]. The complexity and effort required for benchmarking parallel programming models, interfaces, and frameworks is well-known [32] and examples of ongoing efforts to simplify and standardize parallel benchmark suites are many, including P3ARSEC and Task Bench.…”
Section: Benchmarks
mentioning
confidence: 99%
“…The original version was written in Fortran and the parallel implementations were in OpenMP and MPI. In recent years, an effort was made to provide parallel versions for C/C++ parallel programming frameworks on multicore systems [26,27] as well as heterogeneous parallel programming on GPUs [5,6,19].…”
Section: NAS Parallel Benchmarks
mentioning
confidence: 99%
“…No manual development and no special Field Programmable Gate Array (FPGA) or programming knowledge are required. The logic generated by this improved approach is up to 43 times faster than its hand-optimized High Level Synthesis (HLS) counterpart, depending on the solution method. The third paper titled "NAS Parallel Benchmarks with Compute Unified Device Architecture (CUDA) and Beyond" by Fernandes et al 3 provides a new CUDA implementation for NASA Parallel Benchmark (NPB). The performance results have shown up to 267% improvements over the best benchmark versions available.…”
mentioning
confidence: 99%