High availability in cloud computing services is essential for maintaining customer confidence and avoiding revenue losses due to SLA violation penalties. Since the software and hardware components of cloud infrastructures may have limited reliability, fault tolerance mechanisms are a means of achieving the necessary dependability requirements. This paper investigates the benefits of a warm-standby replication mechanism in a Eucalyptus cloud computing environment. A hierarchical heterogeneous modeling approach is used to represent a redundant architecture and compare its availability to that of a non-redundant architecture. Both hardware and software failures are considered in the proposed analytical models. The results show enhanced dependability for the proposed redundant system, as well as a decrease in annual downtime. The results also demonstrate that simply replacing hardware with more reliable machines would not improve system availability to the same extent as the fault-tolerant approach.
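To make the comparison between redundant and non-redundant architectures concrete, the following is a minimal sketch of the standard steady-state availability arithmetic. It is not the hierarchical model used in the paper: it assumes independent failures, treats the standby as an idealized parallel replica with instantaneous failover, and uses hypothetical MTTF and MTTR values chosen only for illustration.

```python
HOURS_PER_YEAR = 8760.0


def availability(mttf_h, mttr_h):
    """Steady-state availability of one component: MTTF / (MTTF + MTTR)."""
    return mttf_h / (mttf_h + mttr_h)


def series(*avails):
    """All components must be up (e.g. hardware AND software on one node)."""
    result = 1.0
    for a in avails:
        result *= a
    return result


def parallel(*avails):
    """At least one replica must be up (idealized redundancy, instant failover)."""
    unavailable = 1.0
    for a in avails:
        unavailable *= (1.0 - a)
    return 1.0 - unavailable


def annual_downtime_hours(a):
    """Expected hours of downtime per year for availability a."""
    return (1.0 - a) * HOURS_PER_YEAR


# Hypothetical figures, not taken from the paper.
hw = availability(mttf_h=8760.0, mttr_h=8.0)   # hardware
sw = availability(mttf_h=720.0, mttr_h=1.0)    # software stack

single_node = series(hw, sw)
redundant_pair = parallel(single_node, single_node)

print(f"single node:    A={single_node:.6f}, "
      f"downtime={annual_downtime_hours(single_node):.2f} h/yr")
print(f"redundant pair: A={redundant_pair:.6f}, "
      f"downtime={annual_downtime_hours(redundant_pair):.2f} h/yr")
```

A real warm-standby system has a nonzero switchover time and a standby that can itself fail while idle, so this idealized parallel formula should be read as an optimistic upper bound.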
The need for high reliability, availability, and performance has increased significantly in modern applications that handle rapidly growing demands while providing uninterrupted services. Cloud computing systems fundamentally provide access to large pools of data and computational resources. Eucalyptus is a software framework widely used to implement private clouds and hybrid-style Infrastructure as a Service. It implements the Amazon Web Services (AWS) API, allowing interoperability with other AWS-based services. This article investigates software aging effects in the Eucalyptus framework, considering workloads composed of intensive requests for remote storage attachment and virtual machine instantiation. We found problems that may harm system dependability and performance, specifically RAM and swap space exhaustion, as well as excessive CPU utilization by the virtual machines. We also present an approach that applies time series analysis to schedule rejuvenation, reducing downtime by predicting the proper moment to perform it. We experimentally evaluate our approach using a Eucalyptus test bed. The results show that our approach achieves higher availability than a threshold-triggered rejuvenation method based on continuous monitoring of resource utilization.
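The idea of scheduling rejuvenation from a forecast, rather than from a fixed utilization threshold, can be sketched as follows. This is an illustration only: the abstract does not say which time series model the authors use, so the sketch swaps in an ordinary least-squares linear trend, and the function names, sample data, resource limit, and safety margin are all hypothetical.

```python
def fit_line(times, values):
    """Ordinary least-squares fit of values = slope * time + intercept."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("need at least two distinct time points")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = sxy / sxx
    return slope, mean_v - slope * mean_t


def predict_exhaustion_time(times, values, limit):
    """Time at which the fitted trend reaches `limit`, or None if not growing."""
    slope, intercept = fit_line(times, values)
    if slope <= 0:
        return None  # no upward trend, so no exhaustion is forecast
    return (limit - intercept) / slope


def schedule_rejuvenation(times, values, limit, safety_margin):
    """Rejuvenate `safety_margin` time units before forecast exhaustion,
    but never earlier than the most recent observation."""
    t_exhaust = predict_exhaustion_time(times, values, limit)
    if t_exhaust is None:
        return None
    return max(times[-1], t_exhaust - safety_margin)


# Hypothetical samples: hours since start vs. memory used (percent).
hours = [0, 1, 2, 3]
mem_used = [10, 20, 30, 40]

when = schedule_rejuvenation(hours, mem_used, limit=100, safety_margin=2)
print(f"schedule rejuvenation at t = {when} h")
```

A purely threshold-triggered scheme would instead act only once the monitored value crosses the limit, which is the baseline the abstract compares against.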