2020
DOI: 10.1109/ms.2020.2987024

The Interplay of Sampling and Machine Learning for Software Performance Prediction

Cited by 57 publications (30 citation statements)
References 16 publications
“…Most existing techniques build global performance-influence models and treat the system as a black box, measuring the system's execution in an environment with a given workload for a subset of all configurations, and learning a model from these observations. The sampling (i.e., selecting which configurations to measure) and learning techniques used [15,17,35,36,49,61,63-65] result in tradeoffs among the cost to build the models and the accuracy and interpretability of the models [15,35,38]. For example, larger samples are more expensive, but usually lead to more accurate models; random forests, with large enough samples, tend to learn more accurate models than those built with linear regression, but the models are harder to interpret when users want to understand performance or debug their systems [15,35,49] (see Fig.…”
Section: Introduction
confidence: 99%
“…• A replication package with subject systems, experimental setup, and data of several months of measurements [74]. There is substantial literature on modeling the performance of software systems [e.g., 15,35,38,75]. Performance-influence models solve a specific problem: Explaining how options and their interactions influence a system's performance for a given workload and environment, designed to help users understand performance and make deliberate configuration decisions.…”
Section: Introduction
confidence: 99%
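
The black-box workflow these statements describe (sample a subset of configurations, measure them, learn a model) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the cited paper: the six binary options, the `measure` function standing in for real measurements, its influence values, and the choice of an additive linear model fitted by least squares. It shows only the sample-size side of the tradeoff; a random forest is not implemented.

```python
import itertools
import random

N_OPTIONS = 6  # hypothetical system with six binary options


def measure(config):
    """Stand-in for a real measurement (hypothetical ground truth):
    base cost, one influence per option, and one option interaction."""
    influences = [5.0, 0.0, 2.0, 0.0, 8.0, 1.0]
    time = 10.0 + sum(w * x for w, x in zip(influences, config))
    return time + 12.0 * config[0] * config[4]  # interaction of options 0 and 4


def solve(a, b):
    """Solve the linear system a @ x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_linear(configs, times, ridge=1e-6):
    """Least-squares fit of an additive model: intercept plus one weight per option.
    The tiny ridge term keeps the normal equations solvable for small samples."""
    rows = [[1.0] + [float(x) for x in c] for c in configs]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
            for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, times)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, config):
    return weights[0] + sum(w * x for w, x in zip(weights[1:], config))


def mean_abs_error(weights, configs):
    return sum(abs(predict(weights, c) - measure(c)) for c in configs) / len(configs)


all_configs = list(itertools.product([0, 1], repeat=N_OPTIONS))
rng = random.Random(0)
for size in (8, 16, 32):
    sample = rng.sample(all_configs, size)  # random sampling of configurations
    weights = fit_linear(sample, [measure(c) for c in sample])
    print(size, round(mean_abs_error(weights, all_configs), 2))
```

The fitted weights can be read directly as per-option influences, which is the interpretability argument for linear models. Because this additive model has no term for the interaction between options 0 and 4, some error remains even when every configuration is measured; that gap is the kind of accuracy limitation the statements contrast with more flexible learners.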