2017
DOI: 10.1145/3148054

Fuse

Abstract: Collecting hardware event counts is essential to understanding program execution behavior. Contemporary systems offer few Performance Monitoring Counters (PMCs), thus only a small fraction of hardware events can be monitored simultaneously. We present new techniques to acquire counts for all available hardware events with high accuracy by multiplexing PMCs across multiple executions of the same program, then carefully reconciling and merging the multiple profiles into a single, coherent profile. We present a n…
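
To make the idea in the abstract concrete, here is a minimal sketch of multiplexing events across runs and merging the resulting profiles. It is not the paper's method: it assumes that each task carries a stable label that matches across executions, which is the hard part in practice, and it reconciles repeated measurements with a plain median. The function names, task labels, and event names are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def schedule_event_groups(events, num_counters):
    """Split the event list into groups of at most num_counters events.

    Each group is meant to be measured in one execution of the program.
    """
    return [events[i:i + num_counters] for i in range(0, len(events), num_counters)]


def merge_profiles(runs):
    """Merge per-run profiles into one profile covering every measured event.

    Each run is a dict {task_label: {event_name: count}}.
    ASSUMPTION: task labels are stable across runs, so matching is a dict lookup.
    When an event was measured for the same task in several runs, the median
    of the observed counts is kept (an illustrative choice, not the paper's).
    """
    samples = defaultdict(lambda: defaultdict(list))
    for profile in runs:
        for task, counts in profile.items():
            for event, value in counts.items():
                samples[task][event].append(value)
    return {
        task: {event: median(values) for event, values in by_event.items()}
        for task, by_event in samples.items()
    }
```

With two counters and five events, `schedule_event_groups` yields three groups, so three executions would be needed; `merge_profiles` then combines the three partial profiles into one.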

Citations: cited by 10 publications (4 citation statements)
References: 22 publications
“…These methods require the entire trace of an application before providing corrections and cannot be run in real time. For example, Lv et al. [37] use the Gumbel test for outlier detection, and Neill et al. [38] use fork‐join aware agglomerative clustering to remove outlier points. These methods are unsuitable for dynamic control situations requiring online HPC correction.…”
Section: Related Work (mentioning)
confidence: 99%
“…At the end of that function, the old process can be identified via the prev parameter, while the newly-scheduled one can be already reached via current. We provide in Listing 3 the reference code to install a context switch callback by relying on kprobes: the install_kprobe function (lines 30-42). In particular, we rely on a kretprobe to be notified when the finish_task_switch function is returning.…”
Section: Per Thread Profiling (mentioning)
confidence: 99%
“…Several works in the mainstream (high-performance) domain reason on the sources of variability in HEM values when executing several times the same piece of software. This covers from the operating system noise [119], application variability [7,121] and the particular HEM-Reading library, to the complexity of the hardware [184]. For instance, [119] focuses on the cycle count HEM and shows that its variability is often related to the executable layout and operating system issues.…”
Section: State-of-the-art on HEM Analysis (mentioning)
confidence: 99%
“…In our work, we use no operating system and access directly, with no library, the HEMs (via the PMCs) so they are not subject to software-induced variability. In [121], authors focus on task-parallel programs in high-performance environments with highly-dynamic execution conditions, including dynamic task scheduling, that cause tasks to execute in different orders and in different cores across executions. Authors propose techniques to determine which HEM readings belong to each task and hence, combine them to derive all HEMs for a task.…”
Section: State-of-the-art on HEM Analysis (mentioning)
confidence: 99%