In resource provisioning for cloud computing, an important issue is how resources may be allocated to an application mix such that the service level agreements (SLAs) of all applications are met. A performance model with two interactive job classes is used to determine the smallest number of servers required to meet the SLAs of both classes. For each class, the SLA is specified by the relationship: Prob [response time ≤ x] ≥ y. Two server allocation strategies are considered: shared allocation (SA) and dedicated allocation (DA). For the case of FCFS scheduling, analytic results for response time distribution are used to develop a heuristic algorithm that determines an allocation strategy (SA or DA) that requires the smallest number of servers. The effectiveness of this algorithm is evaluated over a range of operating conditions. The performance of SA with non-FCFS scheduling is also investigated. Among the scheduling disciplines considered, a new discipline called probability dependent priority is found to have the best performance in terms of requiring the smallest number of servers.
An elasticity policy governs how and when resources (e.g., application server instances at the PaaS layer) are added to and/or removed from a cloud environment. The elasticity policy can be implemented as a conventional control loop or as a set of heuristic rules. In the control-theoretic approach, complex constructs such as tracking filters, estimators, regulators, and controllers are utilized. In the heuristic, rule-based approach, various alerts (e.g., events) are defined on instance metrics (e.g., CPU utilization), which are then aggregated at a global scale in order to make provisioning decisions for a given application tier. This work provides an overview of our experiences designing and working with both approaches to construct an autoscaler for simple applications. We enumerate different criteria such as design complexity, ease of comprehension, and maintenance upon which we form an informal comparison between the different methods. We conclude with a brief discussion of how these approaches can be used in the governance of resources to better meet a high-level goal over time.
An application provider leases resources (i.e., virtual machine instances) of variable configurations from a IaaS provider over some lease duration (typically one hour). The application provider (i.e., consumer) would like to minimize their cost while meeting all service level obligations (SLOs). The mechanism of adding and removing resources at runtime is referred to as autoscaling. The process of autoscaling is automated through the use of a management component referred to as an autoscaler. This paper introduces a novel autoscaling approach in which both cloud and application dynamics are modeled in the context of a stochastic, model predictive control problem. The approach exploits tradeoff between satisfying performance related objectives for the consumer's application while minimizing their cost. Simulation results are presented demonstrating the efficacy of this new approach.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.