2020
DOI: 10.48550/arxiv.2011.00243
Preprint

An analytic performance model for overlapping execution of memory-bound loop kernels on multicore CPUs

Ayesha Afzal,
Georg Hager,
Gerhard Wellein

Abstract: Complex applications running on multicore processors show a rich performance phenomenology. The growing number of cores per ccNUMA domain complicates performance analysis of memory-bound code since system noise, load imbalance, or task-based programming models can lead to thread desynchronization. Hence, the simplifying assumption that all cores execute the same loop cannot be upheld. Motivated by observations on plain and modified versions of the HPCG benchmark, we construct a performance model of execution …

citations
Cited by 1 publication
(1 citation statement)
references
References 8 publications
“…Afzal et al [4,2,3,1] were the first to investigate the dynamics of idle waves, (de)synchronization processes, and computational wavefront formation in parallel programs with core-bound and memory-bound code, showing that nonlinear processes dominate there. Our work builds on theirs to significantly extend it for analytic modeling with further influence factors, such as communication topology, communication concurrency, system topology and noise structure.…”
Section: Related Work (mentioning)
confidence: 99%