FAST Failure Detection Service for Large Scale Distributed Systems

Kalewski, Michał; Kobusińska, Anna; Kobusiński, Jacek

doi:10.1109/pdp.2009.33

Cited by 5 publications

(6 citation statements)

References 17 publications

“…This requires an appropriate fault detection algorithm, which can determine the new graph topology and, importantly, the new number of nodes. Such algorithms are typically executed intermittently, giving rise to a certain detection latency, .…”

Section: Fault Tolerant Implementation (mentioning)

confidence: 99%

Fault tolerant distributed portfolio optimization in smart grids

Juelsgaard

Wiśniewski

Bendtsen

2014

Intl J Robust & Nonlinear

Summary: This work considers a portfolio of units for electrical power production and the problem of utilizing it to maintain power balance in the electrical grid. We treat the portfolio as a graph in which the nodes are distributed generators and the links are communication paths. We present a distributed optimization scheme for power balancing, where communication is allowed only between units that are linked in the graph. We include consumers with controllable consumption as an active part of the portfolio. We show that a suboptimal, but arbitrarily good power balancing, can be obtained in an uncoordinated, distributed optimization framework, and we argue that the scheme will work even if the computation time is limited. We further show that our approach can tolerate changes in the portfolio, in the sense that increasing or reducing the number of units in the portfolio requires only local updates. This ensures that units experiencing faults or need for maintenance can be removed from the graph without affecting the overall performance or convergence of the optimization. The results are illustrated by numerical case studies. Copyright © 2014 John Wiley & Sons, Ltd.

“…The decision which recovery point should be used is based on information gathered by SIM both from the service and from RMU. Details and rationale of this algorithm (l. 7-13) are beyond the scope of this paper and were provided in [6].…”

Section: Algorithm 1 Rollback-recovery Protocol - Data Types (mentioning)

confidence: 99%

“…If an appropriate response is available, it is sent back to the client. If the request is already saved but there is no response, RMU orders CIM to repeat the request later, as the request is still executed by the service and the response is expected to arrive (l. 5-11). If the received request is not yet processed, it is saved in RMU's stable storage, supplemented with the target service's identifier and epoch number, and directed to the service's SIM module (l. 12-17).…”

Section: Algorithm 1 Rollback-recovery Protocol - Data Types (mentioning)

confidence: 99%
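
The three request-handling cases described in the statement above can be sketched roughly as follows. This is an illustrative toy, not the RESERVE protocol itself: the class and method names (`RecoveryUnit`, `handle`, `record_response`), the return labels, and the use of an in-memory dict in place of stable storage are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class StoredRequest:
    """A saved request plus the metadata the statement mentions."""
    payload: str
    service_id: str                 # target service's identifier
    epoch: int                      # epoch number of the target service
    response: Optional[str] = None  # filled in once the service answers


@dataclass
class RecoveryUnit:
    """Hypothetical stand-in for the RMU; a dict models stable storage."""
    storage: Dict[str, StoredRequest] = field(default_factory=dict)

    def handle(self, request_id: str, payload: str,
               service_id: str, epoch: int) -> Tuple[str, Optional[str]]:
        saved = self.storage.get(request_id)
        if saved is not None and saved.response is not None:
            # Case 1: a response is already available, send it back.
            return ("reply", saved.response)
        if saved is not None:
            # Case 2: request saved, no response yet; ask the client side
            # (CIM in the paper) to repeat the request later.
            return ("retry_later", None)
        # Case 3: unseen request; save it with service id and epoch,
        # then direct it to the service's SIM module.
        self.storage[request_id] = StoredRequest(payload, service_id, epoch)
        return ("forward_to_sim", None)

    def record_response(self, request_id: str, response: str) -> None:
        """Store the service's response for a previously saved request."""
        self.storage[request_id].response = response
```

Calling `handle` three times for the same request identifier, with `record_response` in between the second and third call, walks through the forward, retry, and reply cases in turn.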

The Impact of Service Semantics on the Consistent Recovery in SOA

Hołenko

Kobusińska

Wawrzyniak

et al. 2014

2014 IEEE International Symposium on Parallel and Distributed Processing With Applications

Self Cite

This paper addresses a problem of consistent recovery of SOA processing. So far, the recovered state was considered as consistent if all events that have occurred before the failure were transparently recovered. However, providing such a strict consistency introduces a high performance overhead. Thus, we propose the semantic-based classification of services that enables to slack the notion of consistent recovered state from the viewpoint of services. We also present the extension of RESERVE rollback-recovery protocol that guarantees the proposed relaxed recovery consistency.

“…The systems described in [5] and [6] use a randomized monitoring topology for failure detection. On the other hand, the one described in [7] uses a structured skip ring topology. We applied the findings reported in these papers not only for failure detection but also for developing an IP address stealing topology.…”

Section: Related Studies (mentioning)

confidence: 99%
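
To make the contrast between a randomized and a ring-shaped monitoring topology concrete, here is a minimal sketch. It does not reproduce the systems cited as [5], [6], or [7]: the function names, the parameter `k` (number of peers each node monitors), and the plain successor ring (used instead of a skip ring, whose exact structure the statement does not give) are assumptions for illustration.

```python
import random
from typing import Dict, List


def random_monitoring(nodes: List[str], k: int, seed: int = 0) -> Dict[str, List[str]]:
    """Each node monitors k distinct peers chosen uniformly at random."""
    rng = random.Random(seed)  # seeded so the example is reproducible
    return {
        node: rng.sample([peer for peer in nodes if peer != node], k)
        for node in nodes
    }


def ring_monitoring(nodes: List[str]) -> Dict[str, List[str]]:
    """Each node monitors only its successor on a ring (assumed simple ring)."""
    return {
        node: [nodes[(i + 1) % len(nodes)]]
        for i, node in enumerate(nodes)
    }
```

In the randomized variant a node is watched by a varying set of peers, whereas in the ring each node has exactly one monitor, so a failed monitor leaves its successor unobserved until the ring is repaired.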

A Scalable Server Load Balancing Method Using IP Address Stealing

Toumura

Nemoto

2013

2013 IEEE 37th Annual Computer Software and Applications Conference Workshops

To deal with rapid traffic growth, service providers who offer their services through the Internet must make their service system scalable. To achieve scalability, service providers often use a load balancer on the front of their system. However, load balancers tend to be a bottleneck of the system or a single point of failure. To address this problem, we propose a novel load balancing method as a way to help achieve scalable services. To eliminate the bottleneck imposed by conventional load balancers, our method utilizes IP routing. Each server has a large number of IP addresses for providing client service. When load imbalance occurs, a lightly loaded server "steals" IP addresses from a heavily loaded server. A preliminary experiment carried out using a prototype implementation demonstrates that our method is effective and that randomized monitoring topology works better than ring topologies.
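
The stealing step in this abstract might look roughly like the sketch below. The abstract does not say how imbalance is measured or how many addresses move at once, so the load values in the range 0 to 1, the `threshold` parameter, the one-address-per-call policy, and the function name `steal_address` are all assumptions.

```python
from typing import Dict, List


def steal_address(owned: Dict[str, List[str]], load: Dict[str, float],
                  thief: str, victim: str, threshold: float = 0.2) -> bool:
    """Move one IP address from victim to thief if the load gap is large.

    Returns True if an address was moved. The victim always keeps at
    least one address (an assumed safeguard, not from the abstract).
    """
    gap = load[victim] - load[thief]
    if gap <= threshold or len(owned[victim]) <= 1:
        return False
    # Take the victim's most recently listed address (arbitrary choice).
    owned[thief].append(owned[victim].pop())
    return True
```

In a real deployment the move would also have to be announced through IP routing, which is outside the scope of this sketch.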
