Finding good configurations of a software system is often challenging since the number of configuration options can be large. Software engineers often make poor configuration choices or, worse, run a sub-optimal configuration in production, leading to inadequate performance. To assist engineers in finding better configurations, this article introduces FLASH, a sequential model-based method that explores the configuration space by reflecting on the configurations evaluated so far to determine the next promising configuration to explore. FLASH scales up to software systems that defeat prior state-of-the-art model-based methods in this area; it runs much faster than existing methods and can solve both single-objective and multi-objective optimization problems. The central insight of this article is to use prior knowledge of the configuration space (gained from previous evaluations) to choose the next promising configuration, which reduces the effort (i.e., the number of measurements) required to find good configurations. We evaluate FLASH using 30 scenarios based on 7 software systems and demonstrate that, compared to state-of-the-art techniques, FLASH saves effort in 100% of single-objective and 80% of multi-objective cases, by up to several orders of magnitude.
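To make the sequential loop concrete, below is a minimal, hypothetical sketch of a FLASH-style sequential model-based search for a single objective (here, minimization). It assumes numeric configuration vectors, a placeholder `measure()` function supplied by the caller, and a regression-tree surrogate; it illustrates the idea under these assumptions rather than reproducing FLASH's actual implementation.

```python
# Sketch of sequential model-based search: fit a cheap surrogate on the
# configurations measured so far, then measure the candidate the surrogate
# predicts to be best. `candidates` and `measure` are placeholders.
import random
from sklearn.tree import DecisionTreeRegressor

def flash_search(candidates, measure, init_size=10, budget=50):
    """Single-objective (minimization) sequential model-based search."""
    candidates = list(candidates)
    random.shuffle(candidates)
    evaluated = [(c, measure(c)) for c in candidates[:init_size]]
    pool = candidates[init_size:]
    while len(evaluated) < budget and pool:
        X = [c for c, _ in evaluated]        # configurations measured so far
        y = [perf for _, perf in evaluated]  # their measured performance
        surrogate = DecisionTreeRegressor().fit(X, y)  # cheap to (re)fit
        # Acquisition step: measure the configuration predicted to be best.
        nxt = min(pool, key=lambda c: surrogate.predict([c])[0])
        pool.remove(nxt)
        evaluated.append((nxt, measure(nxt)))
    return min(evaluated, key=lambda cp: cp[1])  # best configuration found
```

The key design choice is that each measurement immediately informs the surrogate, so the budget is spent on configurations the model currently believes are promising rather than on random samples.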
Finding the optimally performing con guration of a so ware system for a given se ing is o en challenging. Recent approaches address this challenge by learning performance models based on a sample set of con gurations. However, building an accurate performance model can be very expensive (and is o en infeasible in practice).e central insight of this paper is that exact performance values (e.g., the response time of a so ware system) are not required to rank con gurations and to identify the optimal one. As shown by our experiments, performance models that are cheap to learn but inaccurate (with respect to the di erence between actual and predicted performance) can still be used rank con gurations and hence nd the optimal con guration. is novel rank-based approach allows us to signi cantly reduce the cost (in terms of number of measurements of sample con guration) as well as the time required to build performance models. We evaluate our approach with 21 scenarios based on 9 so ware systems and demonstrate that our approach is bene cial in 16 scenarios; for the remaining 5 scenarios, an accurate model can be built by using very few samples anyway, without the need for a rank-based approach.
With the advent of big data applications, which tend to have long execution times, choosing the right cloud VM to run them has significant performance as well as economic implications. For example, in our large-scale empirical study of 107 different workloads on three popular big data systems, we found that a wrong choice can lead to a 20-times slowdown or a 10-times increase in cost. Bayesian optimization is a technique for optimizing expensive (black-box) functions. Previous attempts have used only instance-level information (such as the number of cores or memory size), which is not sufficient to represent the search space. In this work, we discover that this can lead to a fragility problem: the search either incurs a high cost or finds only a sub-optimal solution. The central insight of this paper is to use low-level performance information to augment the process of Bayesian optimization. Our novel low-level augmented Bayesian optimization is rarely worse than current practices and often performs much better (in 46 of 107 cases). Further, it significantly reduces the search cost in nearly half of our case studies. Based on this work, we conclude that it is often insufficient to configure cloud instances with general-purpose off-the-shelf methods without augmenting those methods with essential systems knowledge such as CPU utilization, working memory size, and I/O wait time.
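As a rough illustration of the augmentation idea (not the paper's actual algorithm), the hypothetical sketch below appends observed low-level metrics to the instance-level features fed to a Gaussian-process surrogate. For VMs not yet measured, it simply substitutes the mean of the metrics observed so far; a real implementation would model those metrics properly.

```python
# Hedged sketch of low-level augmented Bayesian optimization for choosing
# a cloud VM. Each element of `measured` is a hypothetical triple
# (vm_features, low_level_metrics, runtime), e.g. low_level_metrics could
# hold CPU utilization, working memory size, and I/O wait time.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def pick_next_vm(measured, candidate_vms):
    # Augment instance-level features with observed low-level metrics, so
    # VMs that behave differently look different to the surrogate.
    X = np.array([np.concatenate([vm, ll]) for vm, ll, _ in measured])
    y = np.array([runtime for _, _, runtime in measured])
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
    # Simplification: use the mean of observed low-level metrics as a
    # stand-in for unmeasured VMs.
    ll_guess = np.mean([ll for _, ll, _ in measured], axis=0)
    def lcb(vm):  # lower confidence bound: prefer fast or uncertain VMs
        mu, sd = gp.predict(np.concatenate([vm, ll_guess]).reshape(1, -1),
                            return_std=True)
        return mu[0] - sd[0]
    return min(candidate_vms, key=lcb)
```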