2015
DOI: 10.1007/978-3-319-17248-4_3

SPEC ACCEL: A Standard Application Suite for Measuring Hardware Accelerator Performance

citations
Cited by 46 publications
(21 citation statements)
references
References 20 publications
“…First, we evaluate a set of microbenchmarks to measure the effect of each of our proposed optimizations in isolation and combination. Second, to get complete end-to-end performance numbers, we run workloads from SpecACCEL [31] and graph500 [52], and we show the performance across a range of fast memory oversubscription scenarios. Third, we sweep the design space to highlight the interesting behaviors that arise and to identify the configuration parameters that perform the best.…”
Section: Methods (mentioning)
confidence: 99%
“…We use all the 19 OpenCL benchmarks in SPECACCEL-v1.2 [28], with each benchmark including one or multiple kernels. For each benchmark, we use its test, train and ref inputs, and present the execution times of the whole program and OpenCL kernels.…”
Section: Discussion (mentioning)
confidence: 99%
“…• OpenCL-specific parameter-guided interprocedural analysis (IPA): We propose the IPA for analyzing the memory objects accessed in both the host and kernel codes, and find new optimization opportunities of static tiling for irregular accesses and using re-computation for saving SPM capacity. • We implement the bandwidth-aware loop tiling approach SWCL, and evaluate it using the SPECACCEL [28] benchmark suite. Experimental results demonstrate that it can bring significant performance improvement, i.e., up to 4x, with a geometric average of 26%.…”
Section: Introduction (mentioning)
confidence: 99%
“…The Polybench [28] and SPEC ACCEL [13] OpenMP 4 benchmark suites are used to evaluate the efficacy of the coalescing-analysis-informed loop reshaping of OpenMP 4.x parallel loop nests. Execution times are reported for two experimental setup machines: an IBM POWER8 host with an Nvidia P100 GPU and an IBM POWER9 host with an Nvidia V100 GPU accelerator.…”
Section: Informed Loop Reshaping Performance Impact (mentioning)
confidence: 99%