Proceedings of the ACM International Conference on Supercomputing 2019
DOI: 10.1145/3330345.3330376

Efficient thread/page/parallelism autotuning for NUMA systems

Abstract: Current multi-socket systems have complex memory hierarchies with significant Non-Uniform Memory Access (NUMA) effects: memory performance depends on the location of the data and the thread. This complexity means that thread and data mappings have a significant impact on performance. However, it is hard to find efficient data mappings and thread configurations due to the complex interactions between applications and systems. In this paper we explore the combined search space of thread mappings, data mappings, …

Cited by 30 publications (22 citation statements)
References: 33 publications
“…Codelet execution extracts hot regions from the application as small, representative codelets and uses them to characterize the application's performance. Codelets are on average 66× faster [35] to evaluate than running the full application. Codelet execution is faster because it only executes a few instances of each region (instead of hundreds during the original run).…”
Section: Faster Evaluation: Sampling With Codelets (citation type: mentioning)
confidence: 99%
“…Codelets have been shown to be quite accurate for both microarchitectural evaluation [12] and NUMA configuration studies [35]. This is because parallel regions typically exhibit similar behavior [40].…”
Section: Faster Evaluation: Sampling With Codelets (citation type: mentioning)
confidence: 99%