To detect deadlock in distributed systems, the initiator should construct an efficient explicit or implicit global wait-for graph. In this paper, we present an unstructured deadlock detection algorithm using a gossip protocol in cloud computing environments, where constituent nodes may join and leave at any time. Because of the inherent properties of a gossip protocol, we argue that our proposed deadlock detection algorithm is scalable, fault-tolerant, and efficient, while retaining safety and liveness properties. The correctness proof of the algorithm is also provided. The message complexity of our proposed algorithm is O(n), where n is the number of nodes. Our performance evaluation with scalable settings shows that our approach has a significant advantage over previous deadlock detection algorithms in terms of scalability, fault tolerance, and complexity efficiency.

cycle (round), every node in the system selects f (fanout) nodes at random and then communicates with them in one of three modes: (i) push; (ii) pull; or (iii) push-pull. Gossip-based algorithms guarantee message delivery to all nodes with high probability; their variations can be found in [13-17]. Applications of gossip-based algorithms include message dissemination, failure detection services, data aggregation, and clock synchronization [18,19].

In this paper, we present an unstructured deadlock detection algorithm based on a gossip-based algorithm. Applying a gossip-based algorithm to the deadlock detection problem is desirable because this type of algorithm copes well with scalability and dynamic behavior in cloud computing systems. In a gossip-based algorithm, the key to resolving the scalability issue is that each node has only a local view.
In other words, each node does not have to maintain all the nodes in the system but only a small number of nodes, and the overlay network is greatly simplified.

The remainder of the paper is organized as follows. We present the system model and formally describe the deadlock detection problem in Section 2. In Section 3, we provide the basic idea and descriptions of our unstructured deadlock detection algorithm that uses gossip communication. Evaluation results for the algorithm and their interpretation are given in Section 4; this section also analyzes the message complexity and proves the correctness properties of the algorithm. After reviewing some related studies in Section 5, we present our conclusions in Section 6.
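The gossip cycle described in this introduction (each node keeps only a small local view, picks f peers from it at random, and exchanges information by push, pull, or push-pull) can be illustrated with a minimal simulation. This is only a sketch of generic gossip dissemination, not the deadlock detection algorithm of Section 3. All names (`build_local_views`, `gossip_round`, `rounds_to_spread`) and the parameter values are assumptions made for illustration, and the views are static, whereas a real system would refresh them as nodes join and leave.

```python
import random


def build_local_views(n, view_size, rng):
    """Give each node a small random partial view of the other nodes (hypothetical helper)."""
    views = {}
    for node in range(n):
        others = [p for p in range(n) if p != node]
        views[node] = rng.sample(others, min(view_size, len(others)))
    return views


def gossip_round(informed, views, fanout, mode, rng):
    """Run one synchronous cycle and return the enlarged set of informed nodes.

    Every node contacts `fanout` random peers from its local view. Decisions
    are based on who was informed at the start of the cycle.
    """
    newly_informed = set()
    for node, view in views.items():
        for peer in rng.sample(view, min(fanout, len(view))):
            # Push: an informed node sends the message to the chosen peer.
            if mode in ("push", "push-pull") and node in informed:
                newly_informed.add(peer)
            # Pull: a node fetches the message from an informed peer.
            if mode in ("pull", "push-pull") and peer in informed:
                newly_informed.add(node)
    return informed | newly_informed


def rounds_to_spread(n=64, view_size=8, fanout=2, mode="push-pull",
                     seed=1, max_rounds=100):
    """Count cycles until all nodes are informed, or return None if that never happens."""
    rng = random.Random(seed)
    views = build_local_views(n, view_size, rng)
    informed = {0}  # node 0 plays the role of the initiator
    for cycle in range(1, max_rounds + 1):
        informed = gossip_round(informed, views, fanout, mode, rng)
        if len(informed) == n:
            return cycle
    return None
```

Calling `rounds_to_spread(n=64, fanout=2, mode="push-pull")` returns the number of cycles needed in one simulated run, or `None` if the random views happen to leave some node unreachable within `max_rounds`. Because each node stores only `view_size` peers rather than all n nodes, the per-node state stays constant as the system grows, which is the scalability argument made above.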
SYSTEM MODEL AND PROBLEM STATEMENTS
System model
We assume that the cloud computing infrastructure consists of numerous resource nodes and that individual nodes run arbitrary programs to achieve a common goal. Because there is no shared memory, each process or node communicates with other nodes only by passing messages through a set of channels. In addition, we assume that all the channels are reliable but not restricted to FIFO, meaning messages within a channe...