2014
DOI: 10.1145/2588788

Enhancing Performance Optimization of Multicore/Multichip Nodes with Data Structure Metrics

Abstract: Program performance optimization is usually based solely on measurements of execution behavior of code segments using hardware performance counters. However, memory access patterns are critical performance-limiting factors for today's multicore chips, where performance is highly memory bound. Therefore, diagnoses and selection of optimizations based only on measurements of the execution behavior of code segments are incomplete because they do not incorporate knowledge of memory access patterns and behaviors. Thi…

Cited by 9 publications (3 citation statements)
References 14 publications
“…On today’s multi-core chips, memory access is a critical performance-limiting factor [31]. Therefore, we have analyzed software behavior and memory access patterns with a profiling tool for high-performance computing applications, PerfExpert [30].…”
Section: Results
mentioning (confidence: 99%)
“…The increasing complexity of multi‐core and multi‐socket nodes makes performance optimization even more cumbersome. There have been many attempts of development of a more or less universal framework for optimization of algorithms that takes into account the main features of hardware. For example, the pitfalls encountered when trying to characterize both the network and the memory performance of modern machines have been emphasized recently…”
Section: Related Work
mentioning (confidence: 99%)
“…There have been many attempts of development of a more or less universal framework for optimization of algorithms that takes into account the main features of hardware. 20,21 For example, the pitfalls encountered when trying to characterize both the network and the memory performance of modern machines have been emphasized recently. 22 The paper of Kutzner et al 23 considers the selection of a best option among alternative GPU-systems for running the GROMACS package.…”
Section: Related Work
mentioning (confidence: 99%)