Failure monitoring and detection is critical to providing scalability, reliability and high availability in current distributed environments. Heartbeat-style interaction is a widely used technique for detecting faults: it monitors the heartbeats of system resources continuously at very short intervals. However, this approach has limitations, as it requires a period of time to detect a faulty node, which delays the subsequent recovery procedures. This paper presents a fault detection mechanism and service that uses a hybrid heartbeat mechanism and a dynamic estimated time of arrival (ETA) for each heartbeat message. The technique introduces an index server for indexing transactions and combines a dynamic hybrid heartbeat mechanism with a pinging procedure for fault detection. The evaluation indicates that the hybrid heartbeat mechanism reduces the time taken to detect faults by approximately 30% compared to existing techniques and provides a basis for customizable recovery actions.
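The abstract does not give the algorithm itself, so the following is only a minimal sketch of how a heartbeat monitor with a dynamic ETA and a ping fallback might look. The class name `HeartbeatMonitor`, the moving-average ETA estimate, the `margin` safety factor and the `ping` callback are all assumptions made for illustration and are not taken from the paper.

```python
from collections import deque
from typing import Callable, Deque, Optional


class HeartbeatMonitor:
    """Hypothetical monitor for one node: dynamic ETA plus ping fallback."""

    def __init__(self, ping: Callable[[], bool], window: int = 5,
                 margin: float = 0.5, default_interval: float = 1.0) -> None:
        self.ping = ping                          # assumed active probe of the node
        self.margin = margin                      # assumed safety factor on the ETA
        self.default_interval = default_interval  # used until intervals are observed
        self.intervals: Deque[float] = deque(maxlen=window)
        self.last_arrival: Optional[float] = None

    def record_heartbeat(self, now: float) -> None:
        """Store the arrival time and update the recent inter-arrival window."""
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def eta(self) -> Optional[float]:
        """Deadline for the next heartbeat: last arrival + mean interval * (1 + margin)."""
        if self.last_arrival is None:
            return None
        if self.intervals:
            mean = sum(self.intervals) / len(self.intervals)
        else:
            mean = self.default_interval
        return self.last_arrival + mean * (1.0 + self.margin)

    def check(self, now: float) -> str:
        """Return 'alive', 'late' (heartbeat missed, ping answered) or 'faulty'."""
        deadline = self.eta()
        if deadline is None or now <= deadline:
            return "alive"
        # The heartbeat missed its ETA: confirm with an explicit ping before
        # declaring a fault.
        return "late" if self.ping() else "faulty"
```

Timestamps are passed in explicitly so the sketch stays deterministic; a real deployment would more likely read them from a monotonic clock. Because the deadline follows the observed heartbeat rhythm instead of a fixed timeout, a node that normally reports quickly is also suspected quickly.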
The trend in current large distributed systems such as Cloud computing is the delivery of computing as a service rather than a product. Availability is one of the most important properties in Cloud computing. One of the central issues in Cloud environments is to provide reliable Infrastructure as a Service (IaaS) with optimal availability. As the number of resources or nodes grows and they become increasingly dynamic and heterogeneous, the potential for failures in the system becomes a significant disruptive factor. This paper proposes the twin co-existence neighbourhood (TCeN) model. It focuses on improving high availability by predicting the future availability expectation of an interdependent environment in a distributed system over an extended period of time.
Distributed systems rely on replication techniques to tolerate data failure and site disconnection, ensuring the flexibility of the system so as to preserve its dependability. The idea of replication is robust; however, practical implementations of replication techniques are often rigid, which can reduce a system's dependability and performance. This study evaluates existing techniques and then develops a new technique, which is compared with the existing ones, with the goal of achieving better fault tolerance, dependability and performance in distributed systems. The new technique is based on a circular neighbor relationship and a quorum-based protocol. The consistency and integrity of replicated data under write and read operations on the replicas are ensured using a replica maintenance protocol. The technique focuses on a synchronous solution, as its quorum execution or commitment protocol showed higher reliability and convenience in avoiding conflicts compared to an asynchronous solution.
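The abstract names a circular neighbor relationship and a quorum-based protocol without specifying either, so the sketch below shows only a generic versioned quorum scheme over replicas arranged in a ring. The names `Replica` and `QuorumRing`, the quorum sizes `w` and `r`, and the rule of picking quorum members by walking the ring from a starting replica are assumptions for illustration, not the paper's protocol.

```python
from typing import Dict, List, Optional, Tuple


class Replica:
    """Hypothetical replica holding (version, value) pairs per key."""

    def __init__(self) -> None:
        self.up = True
        self.store: Dict[str, Tuple[int, str]] = {}


class QuorumRing:
    """Replicas in a ring; quorums are the first live neighbors from a start index."""

    def __init__(self, n: int, w: int, r: int) -> None:
        if not (1 <= w <= n and 1 <= r <= n):
            raise ValueError("quorum sizes must lie between 1 and n")
        if r + w <= n or 2 * w <= n:
            # r + w > n: every read quorum overlaps every write quorum.
            # 2w > n: any two write quorums overlap, so versions keep increasing.
            raise ValueError("need r + w > n and 2 * w > n")
        self.replicas = [Replica() for _ in range(n)]
        self.w, self.r = w, r

    def _live_from(self, start: int) -> List[Replica]:
        """Walk the ring clockwise from `start`, skipping replicas that are down."""
        n = len(self.replicas)
        ring = [self.replicas[(start + i) % n] for i in range(n)]
        return [rep for rep in ring if rep.up]

    def write(self, start: int, key: str, value: str) -> bool:
        """Synchronous write: succeeds only if w live replicas can be updated."""
        live = self._live_from(start)
        if len(live) < self.w:
            return False
        quorum = live[: self.w]
        version = 1 + max(rep.store.get(key, (0, ""))[0] for rep in quorum)
        for rep in quorum:
            rep.store[key] = (version, value)
        return True

    def read(self, start: int, key: str) -> Optional[str]:
        """Read from r live replicas and return the value with the highest version."""
        live = self._live_from(start)
        if len(live) < self.r:
            return None
        entries = [rep.store.get(key, (0, None)) for rep in live[: self.r]]
        return max(entries, key=lambda entry: entry[0])[1]
```

With five replicas and `w = r = 3`, a read still returns the newest value after one replica fails, because any three live replicas include at least one member of the latest write quorum. The sketch omits what a real commitment protocol would add, such as locking and rollback of partially applied writes.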