Fault-Tolerant Aggregation by Flow Updating

Jesus, Paulo; Baquero, Carlos; Almeida, Paulo Sérgio

doi:10.1007/978-3-642-02164-0_6

Cited by 19 publications

(35 citation statements)

References 20 publications

“…While the search for efficient distributed aggregation algorithms has received a lot of attention (cf. [5,7]), non-trivial fault tolerance aspects were only addressed recently [1,2,4,8]. Fault tolerance at the aggregation level is inherently challenging, since it requires the development of strong fault tolerance techniques solely working at the algorithmic level.…”

Section: Fault Tolerance Aspects

mentioning

confidence: 99%

“…The subsequent discussion of these algorithms can obviously only cover the fundamental properties and different approaches of the algorithms. For a detailed exposition we refer to the literature [1][2][3][4][5].…”

Section: Fault Tolerance Aspects

mentioning

confidence: 99%

“…In case we set in our simulations a target precision τ and an algorithm does not reach it, i.e., E_max(t) > τ for all times t, we abort the computation after a predefined maximal number of iterations (usually 1000), and report the accuracy measures for the last completed iteration as achieved accuracy. While the usage of E_max as measure for the achieved accuracy guarantees that all nodes reached the prescribed target accuracy, existing experimental work (see, e.g., [1,2]) often provides measures like the mean square error (MSE) [2] or the root mean square error (RMSE) [1] where no accuracy guarantees for a single node can be given. Such differences in the accuracy evaluation are fundamental and have a big influence on the computed results and their interpretation.…”

Section: Objectives and Evaluation Procedures

mentioning

confidence: 99%
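The distinction drawn in the statement above, between a worst-case measure and averaged measures, can be illustrated with a small sketch. It assumes that E_max is the maximum absolute error over all nodes; the quoted passage does not give the exact definition, so this is an assumption, and the function names and sample numbers are hypothetical.

```python
import math


def max_error(estimates, true_value):
    """Worst-case absolute error over all nodes (assumed meaning of E_max)."""
    return max(abs(e - true_value) for e in estimates)


def mse(estimates, true_value):
    """Mean square error over all nodes."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


def rmse(estimates, true_value):
    """Root mean square error over all nodes."""
    return math.sqrt(mse(estimates, true_value))


# Hypothetical data: three nodes are exact, one node is off by 2.
estimates = [10.0, 10.0, 10.0, 12.0]
true_value = 10.0
tau = 1.5  # hypothetical target precision

print("E_max:", max_error(estimates, true_value))
print("MSE:  ", mse(estimates, true_value))
print("RMSE: ", rmse(estimates, true_value))
print("RMSE within tau:", rmse(estimates, true_value) <= tau)
print("every node within tau:", max_error(estimates, true_value) <= tau)
```

With these numbers the RMSE stays below the target while one node misses it, which is the situation the statement warns about: an averaged measure gives no guarantee for any single node.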

“…In the course of this, we aim for a comparison of the strengths and weaknesses of existing algorithms as well as for an analysis how their theoretically predicted behavior is affected in practice. In this paper, we consider the recently proposed fault tolerant aggregation algorithms Flow-Updating [1], LiMoSense [2], Push-Flow [3] and the novel Push-Cancel-Flow [4] (cf. Section 2).…”

Section: Introduction

mentioning

confidence: 99%

“…In contrast to LiMoSense which extends the PS algorithm by fault tolerance mechanisms, Flow-Updating [1] is an entirely independent approach. It is based on the following two ideas: (i) a node computes its local estimate of the aggregate as average over the most recent estimates it got from its neighbors.…”

mentioning

confidence: 99%
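Idea (i) from the statement above can be sketched as follows. This covers only the local averaging step; the second idea is cut off in the quoted text and is deliberately not reconstructed here, so the sketch is not the Flow-Updating algorithm itself. The class name, method names, and the fallback to the node's own value when nothing has been received are assumptions made for illustration.

```python
class Node:
    """Keeps the most recent estimate received from each neighbor."""

    def __init__(self, own_value):
        self.own_value = own_value
        self.latest = {}  # neighbor id -> most recent estimate from that neighbor

    def receive(self, neighbor_id, estimate):
        # A newer message from the same neighbor overwrites the older one.
        self.latest[neighbor_id] = estimate

    def local_estimate(self):
        # Assumption: with no messages yet, fall back to the node's own value.
        if not self.latest:
            return self.own_value
        return sum(self.latest.values()) / len(self.latest)


node = Node(own_value=1.0)
node.receive("a", 2.0)
node.receive("b", 4.0)
node.receive("a", 6.0)  # replaces the earlier estimate from "a"
print(node.local_estimate())
```

Storing one entry per neighbor is what makes the average run over the most recent estimates only, rather than over every message ever received.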

Robust Gossip-Based Aggregation: A Practical Point of View

Niederbrucker

Gansterer

2013

2013 Proceedings of the Fifteenth Workshop on Algorithm Engineering and Experiments (ALENEX)

Over the last years, several gossip-based aggregation algorithms have been developed which focus on providing resilience in failure-prone distributed systems. The main objective of such algorithms is the efficient in-network computation of aggregates even in the case when system failures occur during runtime. In this paper, we evaluate performance and limitations in practical computations of those gossip-based aggregation algorithms with the most promising theoretical fault tolerance properties. Theoretical analyses of these algorithms usually address only the principal ability of handling or overcoming a certain kind of system failure. Most of the time, there are no formal results on the concrete impact of failure handling on the performance of the algorithms, e.g., in terms of convergence speed. This leaves a wide gap between theory and practice, as we illustrate in this paper. In order to bridge this gap, we first categorize common system failures of interest. Then, we experimentally investigate how well these common failure types are handled in practice by the considered algorithms and up to which extent these state-of-the-art methods provide a reasonable degree of fault tolerance in practice. Our experimental studies reveal (i) that certain failure handling approaches which work in theory exhibit unacceptable performance in practice and (ii) that in some cases the failure handling mechanisms used introduce new problems, e.g., numerical inaccuracy. Our investigations illustrate that for some failure types (such as permanent node failures) further algorithmic advances are required to achieve resilience with a reasonably small overhead and acceptable performance.

A comparative study of spanning tree and gossip protocols for aggregation

Nyers

Jelasity

2015

Concurrency and Computation

Distributed aggregation queries like average and sum can be implemented in different paradigms like gossip and hierarchical approaches. In the literature, these two paradigms are routinely associated with stereotypes such as "trees are fragile and complicated" and "gossip is slow and expensive". However, a closer look reveals that these statements are not backed up by systematic studies. A fair and informative comparison is clearly needed. However, this is a hard task because the performance of protocols from the two paradigms depends on different subtleties of the environment and the implementation of the protocols. We tackle this problem by carefully designing the comparison study. We use state-of-the-art algorithms and propose the problem of monitoring the network size in the presence of churn as the ideal problem for comparing very different paradigms for global aggregation. Our simulation study helps us identify the most important factors that differentiate between gossip and spanning tree aggregation: the time needed to compute a truly global output, the properties of the underlying topology, and sensitivity to dynamism. We demonstrate the effect of these factors in different practical topologies and scenarios. Our results help us to choose the right protocol in the light of the topology and dynamism patterns.

Introduction

Fully distributed aggregation is an important problem where we wish to execute queries such as sum, average, minimum, or maximum over unreliable networks (sensor networks, physical networks of routers, overlay networks, etc.), in which no central servers are directly accessible.

At least two paradigms are known that solve this problem. The first one is the gossip approach where algorithms were proposed to achieve large degrees of robustness. Gossip protocols do not rely on fixed topologies: nodes exchange information with random neighbors to implement a diffusion-like computation pattern, and as a result the system converges to a state where all the nodes know the query result. From the literature, here we just focus on the adaptive approaches. In [1], the authors propose the restarting technique to convert any one-shot algorithm into an adaptive one. Apart from restarting, other approaches have been proposed that focus on error correction through some form of bookkeeping at the nodes [2][3][4][5].

The second paradigm is hierarchical aggregation, which is a popular method in sensor networks [6]. It was also proposed for general process groups [7]. Tree-based aggregation remained unpopular in some areas like peer-to-peer networks due to the widely held assumptions about its lack of robustness. There are a few notable exceptions however: the Astrolabe framework [8], which is in fact only a virtual tree with completely unstructured gossip communication patterns behind it; the GAP protocol and its variants [5, 9-11] that actually build a spanning tree over a distributed network; and PRISM [12], a hierarchical approach that is built on top of a distributed hashtable, with a focus on dete...
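The diffusion-like gossip pattern described in the introduction text above (nodes exchanging information with random neighbors until all nodes know the result) can be sketched with simple pairwise averaging. This is a generic, failure-free illustration and not one of the protocols evaluated in the cited study; the function name, the synchronous round structure, and the example topology are assumptions.

```python
import random


def gossip_average(values, neighbors, rounds, seed=0):
    """Pairwise-averaging gossip: in each round every node picks a random
    neighbor, and both replace their values with the pair's mean.
    Each exchange preserves the global sum, so values drift toward the mean."""
    rng = random.Random(seed)  # fixed seed for a repeatable run
    x = list(values)
    for _ in range(rounds):
        for i in range(len(x)):
            j = rng.choice(neighbors[i])
            x[i] = x[j] = (x[i] + x[j]) / 2
    return x


# Hypothetical example: four fully connected nodes, true average 6.0.
values = [0.0, 4.0, 8.0, 12.0]
neighbors = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]
result = gossip_average(values, neighbors, rounds=50)
print(result)
```

No node needs a global view or a fixed structure such as a tree: each step touches only one pair of neighbors, which is the property the text contrasts with hierarchical aggregation.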

Spanning Tree or Gossip for Aggregation: A Comparative Study

Nyers

Jelasity

2014

Lecture Notes in Computer Science

Abstract. Distributed aggregation queries like average and sum can be implemented in several different paradigms including gossip and hierarchical approaches. In the literature, these two paradigms are routinely associated with stereotypes such as "trees are fragile and complicated" and "gossip is slow and expensive". However, a closer look reveals that these statements are not backed up by thorough studies. A fair and informative comparison is clearly needed. However, it is a very hard task, because the performance of protocols from the two paradigms depends on different subtleties of the environment and the implementation of the protocols. We tackle this problem by carefully designing the comparison study. We use state-of-the-art algorithms and propose the problem of monitoring the network size in the presence of churn as the ideal problem for comparing very different paradigms for global aggregation. Our experiments help us identify the most important factors that differentiate between gossip and spanning tree aggregation: the time needed to compute a truly global output, the properties of the underlying topology, and the sensitivity to dynamism. We demonstrate the effect of these factors in different practically interesting topologies and scenarios. Our results help us to choose the right protocol in the knowledge of the topology and dynamism patterns.
