There are many methods for making a multicast protocol "reliable." At one end of the spectrum, a reliable multicast protocol might offer atomicity guarantees, such as all-or-nothing delivery, delivery ordering, and perhaps additional properties such as virtually synchronous addressing. At the other are protocols that use local repair to overcome transient packet loss in the network, offering "best effort" reliability. Yet none of this prior work has treated stability of multicast delivery as a basic reliability property, such as might be needed in an internet radio, television, or conferencing application. This article looks at reliability with a new goal: development of a multicast protocol which is reliable in a sense that can be rigorously quantified and includes throughput stability guarantees. We characterize this new protocol as a "bimodal multicast" in reference to its reliability model, which corresponds to a family of bimodal probability distributions. Here, we introduce the protocol, provide a theoretical analysis of its behavior, review experimental results, and discuss some candidate applications. These confirm that bimodal multicast is reliable and scalable, and that the protocol provides remarkably stable delivery throughput.

PREFACE

Encamped on the hilltops overlooking the enemy fortress, the commanding General prepared for the final battle of the campaign. Given the information he was gathering about enemy positions, his forces could prevail. Indeed, if most of his observations could be communicated to most of his forces, the battle could be won even if some reports reached none or very few of his troops. But if many reports failed to get through, or reached many but not most of his commanders, their attack would be uncoordinated and the battle lost, for only he was within direct sight of the enemy, and in the coming battle strategy would depend critically upon the quality of the information at hand.

Although the General had anticipated such a possibility, his situation was delicate. As the night wore on, he dispatched wave upon wave of updates on the enemy troop placements. Some couriers perished in the dark, wet forests separating the camps. Worse still, some of his camps were beset by the disease that had ravaged the allies since the start of the campaign. They could not be relied upon, as chaos and death ruled there.

With the approach of dawn, the General sat sipping coffee (rotgut stuff) reflectively. In the night, couriers came and went, following secret protocols worked out during the summer. At the appointed hour, he rose to lead the attack. The General was not one to shirk a calculated risk.
The design and correctness of a communication facility for a distributed computer system are reported. The facility provides support for fault-tolerant process groups in the form of a family of reliable multicast protocols that can be used in both local- and wide-area networks. These protocols attain high levels of concurrency while respecting application-specific delivery ordering constraints, and have varying cost and performance that depend on the degree of ordering desired. In particular, a protocol that enforces causal delivery orderings is introduced and shown to be a valuable alternative to conventional asynchronous communication protocols. The facility also ensures that the processes belonging to a fault-tolerant process group will observe consistent orderings of events affecting the group as a whole, including process failures, recoveries, migration, and dynamic changes to group properties like member rankings. A review of several uses for the protocols in the ISIS system, which supports fault-tolerant resilient objects and bulletin boards, illustrates the significant simplification of higher level algorithms made possible by our approach.
The Isis toolkit is a distributed programming environment based on support for virtually synchronous process groups and group communication. We present a new suite of protocols in support of this model. Our approach revolves around a multicast primitive, called CBCAST, which implements fault-tolerant, causally ordered message delivery. This primitive can be used directly, or extended into a totally ordered multicast primitive, called ABCAST. CBCAST normally delivers messages immediately upon reception, and imposes a space overhead proportional to the size of the groups to which the sender belongs, usually a small number. We conclude that process groups and group communication can achieve performance and scaling comparable to that of a raw message transport layer, a finding contradicting the widespread concern that this style of distributed computing may be unacceptably costly.
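The abstract does not describe how causal ordering is enforced. One common way to get causally ordered delivery with a per-message overhead proportional to group size is to attach a vector timestamp to each message. The sketch below illustrates that general idea only, assuming a fixed group, no failures, and reliable point-to-point transport. The names `Message` and `CausalProcess` are hypothetical and are not taken from Isis.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: int        # index of the sending process in the group
    timestamp: tuple   # vector timestamp: one counter per group member
    payload: str


class CausalProcess:
    """Illustrative group member that delivers multicasts in causal order."""

    def __init__(self, pid, group_size):
        self.pid = pid
        self.clock = [0] * group_size  # messages delivered so far, per sender
        self.pending = []              # received but not yet deliverable
        self.delivered = []            # payloads in delivery order

    def multicast(self, payload):
        # Count our own send, stamp the message, and deliver it locally.
        self.clock[self.pid] += 1
        self.delivered.append(payload)
        return Message(self.pid, tuple(self.clock), payload)

    def _deliverable(self, msg):
        for k, t in enumerate(msg.timestamp):
            if k == msg.sender:
                # Must be the next message we expect from this sender.
                if t != self.clock[k] + 1:
                    return False
            elif t > self.clock[k]:
                # The sender had seen a message we have not delivered yet.
                return False
        return True

    def receive(self, msg):
        # Buffer the message, then deliver everything that has become ready.
        self.pending.append(msg)
        progress = True
        while progress:
            progress = False
            for m in list(self.pending):
                if self._deliverable(m):
                    self.pending.remove(m)
                    self.clock[m.sender] = m.timestamp[m.sender]
                    self.delivered.append(m.payload)
                    progress = True
```

In this sketch a message whose causal predecessors have already arrived is delivered as soon as it is received; only messages that arrive ahead of something they depend on wait in `pending`. The timestamp carries one integer per group member, which is where the overhead proportional to group size comes from in this scheme.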