We determine the weakest failure detectors to solve several fundamental problems in distributed message-passing systems, for all environments, i.e., regardless of the number and timing of crashes. The problems that we consider are: implementing an atomic register, solving consensus, solving quittable consensus (a variant of consensus in which processes have the option to decide 'quit' if a failure occurs), and solving non-blocking atomic commit.
In the population protocol model introduced by Angluin et al. [2], a collection of agents, which are modelled by finite state machines, move around unpredictably and have pairwise interactions. The ability of such systems to compute functions on a multiset of inputs that are initially distributed across all of the agents has been studied in the absence of failures. Here, we show that essentially the same set of functions can be computed in the presence of halting and transient failures, provided preconditions on the inputs are added so that the failures cannot immediately obscure enough of the inputs to change the outcome. We do this by giving a general-purpose transformation that makes any algorithm for the fault-free setting tolerant to failures.
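To make the fault-free model concrete, the following is a minimal sketch of a population protocol simulation. It is not taken from the paper: the function names `or_transition` and `simulate`, the uniformly random scheduler, and the choice of computing the logical OR of the input bits are all illustrative assumptions. It does not model halting or transient failures, nor the transformation described above.

```python
import random


def or_transition(initiator, responder):
    """Hypothetical transition rule: both agents adopt the OR of their bits."""
    result = initiator | responder
    return result, result


def simulate(inputs, transition, max_interactions=10_000, seed=0):
    """Run pairwise interactions chosen by a uniformly random scheduler.

    Assumption: a random scheduler stands in for the "unpredictable"
    movement of agents; the model itself only requires fairness.
    """
    rng = random.Random(seed)
    states = list(inputs)  # one finite-state agent per input value
    if len(states) < 2:
        return states  # no pair of agents can interact
    for _ in range(max_interactions):
        # Pick two distinct agents and apply the transition to the pair.
        i, j = rng.sample(range(len(states)), 2)
        states[i], states[j] = transition(states[i], states[j])
    return states


if __name__ == "__main__":
    print(simulate([0, 0, 1, 0, 0], or_transition))
```

In this sketch a single agent holding 1 spreads that value through the population, so every agent eventually outputs the OR of the inputs. It also hints at why failures matter: if the only agent holding 1 halted before interacting, the outcome would change, which is the kind of situation the preconditions on the inputs are meant to rule out.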