2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
DOI: 10.1109/ipdps47924.2020.00095
Learning Cost-Effective Sampling Strategies for Empirical Performance Modeling

Abstract: Identifying scalability bottlenecks in parallel applications is a vital but laborious and expensive task. Empirical performance models have proven helpful in finding such limitations, though they require a set of experiments to gain valuable insights. The experiment design therefore determines both the quality and the cost of the models. Extra-P is an empirical modeling tool that uses small-scale experiments to assess the scalability of applications. Its current version requires an exponential number …

Cited by 11 publications (14 citation statements)
References 36 publications
“…Their focus is mainly on obtaining (absolute) performance estimates for an algorithm in isolation, whose interpretability depends on the number of measurements (sample size) used to compute such estimates. While Ritter et al. [22] proposed cost-effective sampling strategies to build performance models, their focus is on identifying scalability bugs. By contrast, we aim to compute interpretable "relative" estimates of algorithms in comparison to one another, which are not much influenced by sample size (see Sec.…”
Section: Related Work
confidence: 99%
“…The coefficients of all hypotheses are automatically derived using regression, and the hypothesis with the smallest error is chosen as the most likely model function. In this work, we use the configuration suggested by Ritter et al. [42]. This approach always generates a human-readable expression from any given measurement data.…”
Section: Empirical Performance Modeling Is Also Hard
confidence: 99%
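The hypothesis-selection step described above can be sketched as follows. This is a minimal illustration, not Extra-P's actual implementation: it fits a handful of assumed candidate model forms c0 + c1 * p^i * log2(p)^j via least-squares regression and picks the one with the smallest residual error, yielding a human-readable expression.

```python
import numpy as np

def fit_best_hypothesis(p, y):
    """Fit candidate hypotheses of the form c0 + c1 * p^i * log2(p)^j
    by least-squares regression and return the one with the smallest
    squared error as a human-readable string.
    The exponent sets below are illustrative only, not Extra-P's
    full search space."""
    candidates = [(i, j) for i in (0.5, 1, 2) for j in (0, 1)]
    best = None
    for i, j in candidates:
        term = p ** i * np.log2(p) ** j
        A = np.column_stack([np.ones_like(p), term])
        coeffs = np.linalg.lstsq(A, y, rcond=None)[0]
        err = np.sum((A @ coeffs - y) ** 2)
        if best is None or err < best[0]:
            best = (err, i, j, coeffs)
    _, i, j, (c0, c1) = best
    return f"{c0:.2f} + {c1:.2f} * p^{i} * log2(p)^{j}"

# Synthetic measurements following ~ 3 + 0.5 * p * log2(p)
p = np.array([2.0, 4.0, 8.0, 16.0, 32.0])
y = 3 + 0.5 * p * np.log2(p)
print(fit_best_hypothesis(p, y))  # → 3.00 + 0.50 * p^1 * log2(p)^1
```

Because the regression is linear in the coefficients once the exponents are fixed, trying every candidate form and comparing residuals is cheap relative to running the experiments themselves.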
“…This means that the models can be affected both by random noise and by systemic interference such as network congestion caused by multiple applications sharing a physical system. While these effects can be mitigated by repeating measurements and controlling the measurement infrastructure, they cannot be eliminated, and their impact grows with the number of parameters considered [42]. In most applications, runtime is concentrated in a small number of routines; while these routines are correctly modeled, the previously discussed disturbances disproportionately affect regions of code with short runtimes and in some cases cause Extra-P to effectively model noise.…”
Section: Search Space
confidence: 99%
“…In practice, five values for each parameter ensure good model quality. Recent developments of Extra-P [3] allow accurate models to be generated from as few as ten measurements when two parameters are considered. Once the users have completed the experiment design, they collect measurements by running an instrumented version of the application in the selected configurations.…”
Section: Clustering With Relative Distance
confidence: 99%
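The cost gap between a full-factorial design and a sparse one can be illustrated with simple counting. This sketch assumes five values per parameter, as the statement above suggests; the `sparse_cost` heuristic (varying one parameter per "line" of the design) is a hypothetical simplification for illustration, not Extra-P's actual sampling strategy.

```python
def full_factorial_cost(num_params, values_per_param=5):
    """Measurements needed when every combination of parameter
    values must be run: grows exponentially with parameter count."""
    return values_per_param ** num_params

def sparse_cost(num_params, values_per_param=5):
    """Rough cost of a sparse design that samples each parameter's
    values along separate lines of the parameter space
    (illustrative heuristic only)."""
    return values_per_param * num_params

for m in (1, 2, 3, 4):
    print(f"{m} parameter(s): full-factorial={full_factorial_cost(m)}, "
          f"sparse={sparse_cost(m)}")
```

With two parameters this gives 25 full-factorial configurations versus 10 sparse ones, matching the "as few as ten measurements" figure quoted above; at four parameters the full-factorial design already needs 625 runs.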
“…repeated executions to counter noise. In this work, we leverage a recent enhancement of Extra-P, a heuristic [3] that gives users more freedom in configuring the measurement space when modeling multiple parameters simultaneously. Previous versions of the tool required all combinations of all parameter values to be considered, which, depending on the application, can be difficult.…”
Section: Introduction
confidence: 99%