Abstract-Cloud Computing provides a convenient means of remote on-demand and pay-per-use access to computing resources. However, its ad hoc management of quality-of-service and SLA poses significant challenges to the performance, dependability and costs of online cloud services. The paper precisely addresses this issue and makes a threefold contribution. First, it introduces a new cloud model, the SLAaaS (SLA aware Service) model. SLAaaS enables a systematic integration of QoS levels and SLA into the cloud. It is orthogonal to other cloud models such as SaaS or PaaS, and may apply to any of them. Second, the paper introduces CSLA, a novel language to describe QoS-oriented SLA associated with cloud services. Third, the paper presents a control-theoretic approach to provide performance, dependability and cost guarantees for online cloud services, with time-varying workloads. The proposed approach is validated through case studies and extensive experiments with online services hosted in clouds such as Amazon EC2. The case studies illustrate SLA guarantees for various services such as a MapReduce service, a cluster-based multi-tier e-commerce service, and a low-level locking service.
International audienceWe are at the dawn of a huge data explosion therefore companies have fast growing amounts of data to process. For this purpose Google developed MapReduce, a parallel programming paradigm which is slowly becoming the de facto tool for Big Data analytics. Although to some extent its use is already wide-spread in the industry, ensuring performance constraints for such a complex system poses great challenges and its management requires a high level of expertise. This paper answers these challenges by providing the first autonomous controller that ensures service time constraints of a concurrent MapReduce workload. We develop the first dynamic model of a MapReduce cluster. Furthermore, PI feedback control is developed and implemented to ensure service time constraints. A feedforward controller is added to improve control response in the presence of disturbances, namely changes in the number of clients. The approach is validated online on a real 40 node MapReduce cluster, running a data intensive Business Intelligence workload. Our experiments demonstrate that the designed control is successful in assuring service time constraints
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.