2019 IEEE 27th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS 2019)
DOI: 10.1109/mascots.2019.00035

Initial Experiments with Duet Benchmarking: Performance Testing Interference in the Cloud

Cited by 6 publications (9 citation statements)
References 15 publications
“…In our context, this means if an A/A test between static configuration and dynamic reconfiguration (for each stoppage criterion) does not report a difference, we conclude that dynamic reconfiguration does not change the benchmark result. Following performance engineering best practice [9,10,27,33], we estimate the confidence interval for the ratio of means with bootstrap [13], using 10,000 iterations [21], and employing hierarchical random resampling with replacement on (1) invocation, (2) iteration, and (3) fork level [27] (again relying on pa [31]). If the confidence interval (of the ratio) straddles 1, there is no statistically significant difference.…”
Section: RQ (mentioning)
confidence: 99%
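
The procedure quoted above can be sketched in a few lines of Python. This is a minimal illustration, not the citing authors' implementation (the quote says they rely on the pa tool). Only the ratio of means, the 10,000 iterations, and the three resampling levels come from the quoted text; the nested-list layout (forks, then iterations, then invocations), the function names, the percentile method, and the 95% level are assumptions made for the example.

```python
import random
from statistics import mean


def resample(data, rng):
    """Resample a nested list with replacement at every level.

    Assumed layout: data[fork][iteration][invocation] -> measured time.
    """
    picked = [rng.choice(data) for _ in data]
    if isinstance(data[0], list):
        return [resample(child, rng) for child in picked]
    return picked


def flatten(data):
    """Yield all leaf measurements of a nested list."""
    for item in data:
        if isinstance(item, list):
            yield from flatten(item)
        else:
            yield item


def ratio_of_means_ci(a, b, iterations=10_000, level=0.95, seed=0):
    """Percentile bootstrap interval for mean(a) / mean(b).

    The 95% level and the percentile method are assumptions, not from the source.
    """
    rng = random.Random(seed)
    ratios = sorted(
        mean(list(flatten(resample(a, rng)))) / mean(list(flatten(resample(b, rng))))
        for _ in range(iterations)
    )
    lower = ratios[int((1 - level) / 2 * (iterations - 1))]
    upper = ratios[int(round((1 + level) / 2 * (iterations - 1)))]
    return lower, upper


def straddles_one(ci):
    """True if the interval contains 1, i.e. no significant difference is reported."""
    lower, upper = ci
    return lower <= 1.0 <= upper
```

In an A/A test, both arguments hold measurements of the same configuration, and `straddles_one` is expected to return `True`. An interval that lies entirely below or above 1 would be read as a performance difference.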
“…The results and implications from RQ 1 are based on the notion of benchmark result similarity. We assess this through statistical A/A tests (based on bootstrap confidence intervals for the ratio of means) and mean performance change rate, similar to previous work [10,33]. Other tests for the similarity of benchmark results, such as non-parametric hypothesis tests and effect sizes [12,33], might lead to different outcomes.…”
Section: Threats To Validity (mentioning)
confidence: 99%
“…This, however, can be problematic as it neglects the distribution of the performance measurements. Performance measurement results are known to often be non-normally distributed (Curtsinger and Berger 2013) (e.g., long-tailed or multimodal), and best practice suggests using bootstrap confidence intervals instead of simple average statistics, such as the mean (Kalibera and Jones 2012; Bulej et al. 2017; Bulej et al. 2019; Stefan et al. 2017; Wang et al. 2018; He et al. 2019; Laaber et al. 2019; Laaber et al. 2020). Consequently, we update Mostafa et al. (2017)'s definition of a performance change to use bootstrap confidence intervals.…”
Section: Performance Changes (mentioning)
confidence: 99%
“…We derive the confidence interval for the ratio of task execution times, which describes the relative performance of the two workloads, using a Monte Carlo procedure based on standard bootstrap confidence interval computation [12], explained in detail in [4]:…”
Section: Duet Measurement Procedures (mentioning)
confidence: 99%
“…Our earlier work [4] introduced the idea of the duet measurement procedure, which improves measurement accuracy in shared resource environments, such as virtual machine instances in the cloud. The procedure is based on the assumption that performance fluctuations due to interference tend to impact similar tenants equally, and attempts to maximize the likelihood of such equal impact by executing the measured artifacts in parallel.…”
Section: Introduction (mentioning)
confidence: 99%
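
The idea behind the duet statements above, that interference hitting both workloads equally cancels out in their ratio, can be illustrated with a small Python sketch. It is not the procedure from the cited work, whose details are not reproduced on this page. The pairing of measurements taken in parallel, the joint resampling of pairs, the function name `paired_ratio_ci`, and the 95% level are all assumptions made for the example.

```python
import random


def paired_ratio_ci(pairs, iterations=10_000, level=0.95, seed=0):
    """Percentile bootstrap interval for the ratio of execution times.

    pairs: list of (time_a, time_b) tuples, each measured in parallel
    (assumed layout). Pairs are resampled jointly, so interference that
    slowed both workloads in one run stays inside that pair.
    """
    rng = random.Random(seed)
    ratios = []
    for _step in range(iterations):
        sample = [rng.choice(pairs) for _pair in pairs]
        total_a = sum(time_a for time_a, _time_b in sample)
        total_b = sum(time_b for _time_a, time_b in sample)
        ratios.append(total_a / total_b)
    ratios.sort()
    lower = ratios[int((1 - level) / 2 * (iterations - 1))]
    upper = ratios[int(round((1 + level) / 2 * (iterations - 1)))]
    return lower, upper
```

If every pair is slowed by the same factor on both sides, for example `(10.0 * f, 20.0 * f)` for varying `f`, the interval collapses onto the true ratio of 0.5 however large the fluctuations in `f` are. Resampling the two workloads independently would not have this property.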