Cloud computing is a well-known paradigm, featuring on-demand provisioning of virtual machines (VMs) with different sizes and images. Systems deployed in the cloud can thus be heterogeneous, i.e., composed of multiple different types of VM instances. An immediate performance challenge arises: how to best use heterogeneous VM instances such that the experienced performance, i.e., the response times, on different VMs is similar. In this paper, we first show to what extent systems deployed in clouds are heterogeneous by collecting data from operational data centers. We also show that response times suffer from high variance across replicas hosted on different VMs. We develop a novel plug-and-play workload controller, Join-the-Best-Queue (JBQ), which aims to reduce the variance and higher percentiles of response times in heterogeneous environments. With the aid of a testbed hosting part of Wikipedia in an operational cloud, we show that JBQ can significantly reduce performance variability across different VMs compared with the prevailing load-distribution policies of the Apache web server.
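The abstract names Join-the-Best-Queue (JBQ) but does not spell out its dispatch rule. Below is a minimal Python sketch of one plausible interpretation, in which a dispatcher weighs each replica's queue length by a per-VM capacity estimate; the Replica class, the capacity values, and the scoring rule are illustrative assumptions, not the paper's implementation.

```python
import random

class Replica:
    """A backend VM replica with an assumed capacity estimate (requests/sec)."""
    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity   # assumed to come from offline profiling
        self.queue_len = 0         # requests currently queued or in service

def join_the_best_queue(replicas):
    """Pick the replica with the lowest approximate expected wait.

    The wait is approximated as (queue length + 1) / capacity, so a fast VM
    may be preferred even with a longer queue. This scoring rule is an
    assumption for illustration, not necessarily the paper's exact policy.
    """
    return min(replicas, key=lambda r: (r.queue_len + 1) / r.capacity)

# Toy usage: two small VMs and one large VM receiving 1000 requests.
replicas = [Replica("small-1", 50), Replica("small-2", 50), Replica("large-1", 150)]
for _ in range(1000):
    target = join_the_best_queue(replicas)
    target.queue_len += 1
    # Crudely simulate a completion somewhere in the cluster.
    done = random.choice(replicas)
    done.queue_len = max(0, done.queue_len - 1)

for r in replicas:
    print(r.name, r.queue_len)
```

The key point of the sketch is that queue length alone is not compared; it is normalized by capacity, which is what lets heterogeneous VMs end up with similar experienced response times.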
To operate systems cost-effectively, cloud providers not only multiplex applications on shared infrastructure but also dynamically allocate available resources, such as power and cores. Data-intensive applications based on the MapReduce paradigm are rapidly growing in popularity and importance in the cloud. Such big-data applications typically have a high fan-out of components and dynamic workloads, so deploying them and further optimizing their performance within (stringent) resource budgets is no mean feat. In this paper, we develop a novel solution, OptiCA, that eases the deployment of big-data applications in the cloud and the control of application components, so that the desired performance metrics are best achieved for any given resource budget expressed in terms of core capacity. The control algorithm of OptiCA distributes the available core budget across co-executed applications and components, based on their "effective" demands obtained through non-intrusive profiling. Our proposed solution achieves robust performance, i.e., with only minor degradation, even when the resource budget decreases rapidly.
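The abstract says OptiCA distributes a core budget according to profiled "effective" demands but does not give the allocation rule. The sketch below shows one simple demand-proportional allocation; the function name, the proportional rule, and the demand figures are assumptions made only to illustrate the idea of budget-constrained allocation.

```python
def allocate_cores(total_budget, effective_demands):
    """Split a core budget across components in proportion to their
    profiled 'effective' demands, never exceeding a component's demand.

    This proportional rule is an illustrative assumption; OptiCA's actual
    control algorithm is not specified in the abstract.
    """
    total_demand = sum(effective_demands.values())
    allocation = {}
    for component, demand in effective_demands.items():
        share = total_budget * demand / total_demand
        allocation[component] = min(demand, share)
    return allocation

# Hypothetical demands (in cores) for components of a MapReduce-style application.
demands = {"mappers": 8.0, "reducers": 4.0, "shuffle": 2.0}
print(allocate_cores(10, demands))   # budget smaller than the total demand
print(allocate_cores(20, demands))   # budget larger than the total demand
```

A proportional rule like this degrades every component gracefully as the budget shrinks, which is one way to read the claim of "only minor degradation" when the budget decreases rapidly.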
Recent studies show that service systems hosted in clouds can elastically scale the provisioning of pre-configured virtual machines (VMs) with workload demands, but suffer from performance variability, particularly varying response times. Service management in clouds is further complicated when aiming to strike an optimal trade-off between cost (i.e., proportional to the number and types of VM instances) and the fulfillment of quality-of-service (QoS) properties (e.g., a system should serve at least 30 requests per second for more than 90% of the time). In this paper, we develop a QoS-aware VM provisioning policy for service systems in clouds with high capacity variability, using both experimental and modeling approaches. Using a wiki service hosted in a private cloud, we empirically quantify the QoS variability of a single VM with different configurations in terms of capacity. We develop a Markovian framework that explicitly models the capacity variability of a service cluster and derives a probability distribution of QoS fulfillment. To achieve the guaranteed QoS at minimal cost, we construct theoretical and numerical cost analyses, which facilitate the search for an optimal cluster size for a given VM configuration and additionally support comparisons between VM configurations.
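To make the cost/QoS trade-off concrete, the sketch below computes a fulfillment probability and searches for the cheapest cluster size. It deliberately replaces the paper's Markovian framework with a much simpler two-state, independent-VM capacity model; all parameter values, function names, and the binomial formulation are assumptions for illustration only.

```python
import math

def fulfillment_probability(n_vms, cap_high, cap_low, p_high, threshold):
    """Probability that the cluster serves at least `threshold` req/s,
    assuming each VM independently runs at cap_high with probability
    p_high and at cap_low otherwise (a simplification of the paper's
    Markovian capacity model).
    """
    prob = 0.0
    for k in range(n_vms + 1):  # k VMs in the high-capacity state
        cluster_cap = k * cap_high + (n_vms - k) * cap_low
        if cluster_cap >= threshold:
            prob += math.comb(n_vms, k) * p_high**k * (1 - p_high)**(n_vms - k)
    return prob

def min_cost_cluster(price_per_vm, cap_high, cap_low, p_high,
                     threshold=30.0, target=0.9, max_vms=50):
    """Smallest (hence cheapest) cluster whose fulfillment probability
    reaches the target, e.g. >= 30 req/s for at least 90% of the time."""
    for n in range(1, max_vms + 1):
        if fulfillment_probability(n, cap_high, cap_low, p_high, threshold) >= target:
            return n, n * price_per_vm
    return None, float("inf")

# Hypothetical numbers: a VM nominally serves 12 req/s but degrades to
# 6 req/s 20% of the time; each VM costs 0.10 $/hour.
print(min_cost_cluster(price_per_vm=0.10, cap_high=12.0, cap_low=6.0, p_high=0.8))
```

Repeating the search for different VM configurations (capacity distributions and prices) mirrors the abstract's comparison between configurations.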
Today's web services are commonly hosted on clusters of servers that are often located within computing clouds, whose computational and storage resources can be highly heterogeneous. The workload served typically exhibits disparate computation patterns (e.g., CPU-intensive or IO-intensive) that fluctuate both in volume and in mix. This system heterogeneity, together with workload diversity, further exacerbates the challenge of effectively distributing load within a computing cloud. This paper presents a novel, mix-aware load-balancing algorithm, which aims to distribute requests sent by multiple applications across heterogeneous servers such that application response times are minimized and system resources (e.g., CPU and IO) are equally utilized. To this end, the presented algorithm tries not only to balance the total number of requests seen by each server, but also to shape the requests received by each server into a certain "mix" that is analytically shown to be optimal for response-time minimization. Our experimental results, based both on simulation and on a prototype implementation, show that the mix-aware algorithm achieves robust performance across most workload mixes, as well as a consistent performance improvement in comparison with one of the most robust load-balancing schemes of the Apache server.
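The abstract describes shaping each server's request mix toward a target while also balancing total load, but not the concrete scoring. The sketch below shows one way a dispatcher might combine the two objectives; the Server class, the deviation-based score, and the target mix value are assumptions, since the paper derives the optimal mix analytically and may use a different rule.

```python
import random

class Server:
    """Tracks how many requests of each class a server currently holds."""
    def __init__(self, name):
        self.name = name
        self.counts = {"cpu": 0, "io": 0}

def mix_aware_pick(servers, req_class, target_mix):
    """Choose the server whose load and request mix would deviate least
    from the target after accepting this request.

    `target_mix` is the desired fraction of CPU-bound requests per server;
    the simple additive score below is an illustrative assumption.
    """
    def score(server):
        counts = dict(server.counts)
        counts[req_class] += 1
        total = sum(counts.values())
        cpu_fraction = counts["cpu"] / total
        load_term = total                                   # keeps total load balanced
        mix_term = abs(cpu_fraction - target_mix) * total   # keeps the mix near the target
        return load_term + mix_term
    return min(servers, key=score)

# Toy usage: three servers, an incoming stream that is 60% CPU-bound.
servers = [Server(f"srv-{i}") for i in range(3)]
for _ in range(300):
    req = "cpu" if random.random() < 0.6 else "io"
    chosen = mix_aware_pick(servers, req, target_mix=0.6)
    chosen.counts[req] += 1

for s in servers:
    print(s.name, s.counts)
```

The two-term score makes the trade-off explicit: a server with a short queue can still be skipped if accepting the request would push its mix far from the target.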