2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS)
DOI: 10.1109/ipdps47924.2020.00095
Learning Cost-Effective Sampling Strategies for Empirical Performance Modeling

Abstract: Identifying scalability bottlenecks in parallel applications is a vital but laborious and expensive task. Empirical performance models have proven helpful in finding such limitations, though they require a set of experiments to gain valuable insights. The experiment design therefore determines both the quality and the cost of the models. Extra-P is an empirical modeling tool that uses small-scale experiments to assess the scalability of applications. Its current version requires an exponential number …

Cited by 11 publications (14 citation statements)
References 36 publications
“…Their focus is mainly on obtaining (absolute) performance estimates for an algorithm in isolation, whose interpretability depends on the number of measurements (sample size) used to compute such estimates. While Ritter et al. [22] proposed cost-effective sampling strategies to build performance models, their focus is on identifying scalability bugs. By contrast, we aim to compute interpretable "relative" estimates of algorithms in comparison to one another, which are not much influenced by sample size (see Sec.…”
Section: Related Work
confidence: 99%
“…The coefficients of all hypotheses are automatically derived using regression, and the hypothesis with the smallest error is chosen as the most likely model function. In this work, we use the configuration suggested by Ritter et al. [42]. This approach always generates a human-readable expression from any given measurement data.…”
Section: Empirical Performance Modeling Is Also Hard
confidence: 99%
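The hypothesis-selection step described above can be sketched as follows. This is a minimal illustration, not Extra-P's actual implementation: it fits a handful of assumed candidate model forms c0 + c1 * p^i * log2(p)^j via least-squares regression and picks the one with the smallest residual error, yielding a human-readable expression.

```python
import numpy as np

def fit_best_hypothesis(p, y):
    """Fit candidate hypotheses of the form c0 + c1 * p^i * log2(p)^j
    by least-squares regression and return the one with the smallest
    squared error as a human-readable string.
    The exponent sets below are illustrative only, not Extra-P's
    full search space."""
    candidates = [(i, j) for i in (0.5, 1, 2) for j in (0, 1)]
    best = None
    for i, j in candidates:
        term = p ** i * np.log2(p) ** j
        A = np.column_stack([np.ones_like(p), term])
        coeffs = np.linalg.lstsq(A, y, rcond=None)[0]
        err = np.sum((A @ coeffs - y) ** 2)
        if best is None or err < best[0]:
            best = (err, i, j, coeffs)
    _, i, j, (c0, c1) = best
    return f"{c0:.2f} + {c1:.2f} * p^{i} * log2(p)^{j}"

# Synthetic measurements following ~ 3 + 0.5 * p * log2(p)
p = np.array([2.0, 4.0, 8.0, 16.0, 32.0])
y = 3 + 0.5 * p * np.log2(p)
print(fit_best_hypothesis(p, y))  # → 3.00 + 0.50 * p^1 * log2(p)^1
```

Because the regression is linear in the coefficients once the exponents are fixed, trying every candidate form and comparing residuals is cheap relative to running the experiments themselves.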
“…This means that the models can be affected both by random noise and by systemic interference such as network congestion caused by multiple applications sharing a physical system. While these effects can be mitigated by repeating measurements and controlling the measurement infrastructure, they cannot be eliminated, and their impact grows with the number of parameters considered [42]. In most applications, runtime is concentrated in a small number of routines; while these routines are correctly modeled, the previously discussed disturbances disproportionately affect regions of code with short runtimes and in some cases cause Extra-P to effectively model noise.…”
Section: Search Space
confidence: 99%
“…In practice, five values for each parameter ensure good model quality. Recent developments of Extra-P [3] allow accurate models to be generated from as few as ten measurements when two parameters are considered. Once the users have completed the experiment design, they collect measurements by running an instrumented version of the application in the selected configurations.…”
Section: Clustering With Relative Distance
confidence: 99%
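The cost gap between a full-factorial design and a sparse one can be illustrated with simple counting. This sketch assumes five values per parameter, as the statement above suggests; the `sparse_cost` heuristic (varying one parameter per "line" of the design) is a hypothetical simplification for illustration, not Extra-P's actual sampling strategy.

```python
def full_factorial_cost(num_params, values_per_param=5):
    """Measurements needed when every combination of parameter
    values must be run: grows exponentially with parameter count."""
    return values_per_param ** num_params

def sparse_cost(num_params, values_per_param=5):
    """Rough cost of a sparse design that samples each parameter's
    values along separate lines of the parameter space
    (illustrative heuristic only)."""
    return values_per_param * num_params

for m in (1, 2, 3, 4):
    print(f"{m} parameter(s): full-factorial={full_factorial_cost(m)}, "
          f"sparse={sparse_cost(m)}")
```

With two parameters this gives 25 full-factorial configurations versus 10 sparse ones, matching the "as few as ten measurements" figure quoted above; at four parameters the full-factorial design already needs 625 runs.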
“…repeated executions to counter noise. In this work, we leverage a recent enhancement of Extra-P, a heuristic [3] that gives users more freedom in configuring the measurement space when modeling multiple parameters simultaneously. Previous versions of the tool required all combinations of all parameter values to be considered, which, depending on the application, can be difficult.…”
Section: Introduction
confidence: 99%