2016
DOI: 10.1007/978-3-319-32149-3_7

Performance Analysis of the Kahan-Enhanced Scalar Product on Current Multicore Processors

Abstract: We investigate the performance characteristics of a numerically enhanced scalar product (dot) kernel loop that uses the Kahan algorithm to compensate for numerical errors, and describe efficient SIMD-vectorized implementations on recent Intel processors. Using low-level instruction analysis and the execution-cache-memory (ECM) performance model we pinpoint the relevant performance bottlenecks for single-core and thread-parallel execution, and predict performance and saturation behavior. We show that …
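To make the kernel named in the abstract concrete, here is a minimal scalar sketch of a Kahan-compensated dot product next to a naive one. It is an illustration of the textbook Kahan summation scheme applied to products, not the SIMD-vectorized implementation from the paper; the function names `kahan_dot` and `naive_dot` are assumptions chosen for this example.

```python
import math


def naive_dot(a, b):
    """Plain dot product: rounding errors accumulate in the running sum."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


def kahan_dot(a, b):
    """Dot product with Kahan compensation (illustrative scalar sketch)."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    total = 0.0  # running sum
    comp = 0.0   # rounding error captured from the previous addition
    for x, y in zip(a, b):
        term = x * y - comp               # feed the captured error back in
        new_total = total + term          # low-order bits of term may be lost here
        comp = (new_total - total) - term # recover what was just lost
        total = new_total
    return total


# One large value followed by many values too small to change it on their own.
a = [1.0] + [1e-16] * 10000
b = [1.0] * len(a)
reference = math.fsum(x * y for x, y in zip(a, b))

print("naive:", naive_dot(a, b))
print("kahan:", kahan_dot(a, b))
print("fsum :", reference)
```

With this input the naive loop never leaves 1.0, because each small term is below half a unit in the last place of the running sum, while the compensated loop stays close to the `math.fsum` reference. The compensation costs several extra floating-point operations per element, which is why the loop's in-core behavior is worth analyzing at all.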

Citations: cited by 9 publications (12 citation statements)
References: 17 publications
“…Here we evaluate the correctness of predictions derived by Kerncraft from kernel codes and hardware descriptions. It is out of the scope of this work to evaluate the underlying performance models; this has been discussed elsewhere [4,23,8,18,11,10]. We will, however, compare predictions by Kerncraft to predictions derived by manual analysis in previously published papers (see Table 5) and point out relevant differences and peculiarities.…”
Section: Discussion (mentioning)
confidence: 99%
“…Here we will have a look at the Kahan-compensated double-precision dot product and the Schönauer Triad. These have been analyzed thoroughly in [11] and [8], respectively.…”
Section: Streaming Kernels (mentioning)
confidence: 99%
“…The ECM model [19,3,18,4,5] is an analytic performance model that, with the exception of sustained memory bandwidth, works exclusively with architecture specifications as inputs. The model estimates the numbers of CPU cycles required to execute a number of iterations of a loop on a single core of a multi- or manycore chip.…”
Section: The ECM Performance Model (mentioning)
confidence: 99%
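The cycle estimate described in the last statement can be sketched as follows. The composition rule used here (in-core work that overlaps with data transfers, versus load/store cycles and inter-level transfers that add up) is the form commonly given for Intel processors in the ECM literature; the quoted excerpt does not spell it out, so treat it as an assumption. The function names and all cycle counts are hypothetical and are not taken from the paper.

```python
import math


def ecm_cycles(t_ol, t_nol, transfers):
    """Predicted cycles for one unit of work (e.g. one cache line of iterations).

    t_ol      -- in-core cycles assumed to overlap with data transfers
    t_nol     -- in-core cycles (loads/stores) assumed not to overlap
    transfers -- cycles per transfer step between adjacent memory levels,
                 e.g. [L1<->L2, L2<->L3, L3<->memory]
    """
    t_data = t_nol + sum(transfers)  # non-overlapping contributions add up
    return max(t_ol, t_data)         # overlapping work hides behind them, or dominates


def saturation_cores(t_ecm_mem, t_l3_mem):
    """Cores at which the memory interface is predicted to saturate."""
    return math.ceil(t_ecm_mem / t_l3_mem)


# Hypothetical inputs, chosen only to exercise the formulas.
t_mem = ecm_cycles(t_ol=8, t_nol=4, transfers=[4, 4, 9])
print("predicted cycles with data in memory:", t_mem)
print("predicted saturation point (cores):", saturation_cores(t_mem, 9))
```

The memory transfer time is the one input that depends on a measured quantity, the sustained memory bandwidth, which matches the exception noted in the statement above.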