Proceedings of the Workshop on Hot Topics in Operating Systems 2021
DOI: 10.1145/3458336.3465299

Fail-slow fault tolerance needs programming support

Abstract: The need for fail-slow fault tolerance in modern distributed systems is highlighted by the increasingly reported fail-slow hardware/software components that lead to poor performance system-wide. We argue that fail-slow fault tolerance not only needs new distributed protocol designs, but also desires programming support for implementing and verifying fail-slow fault-tolerant code. Our observation is that the inability of tolerating fail-slow faults in existing distributed systems is often rooted in the implemen…

Cited by 3 publications (2 citation statements)
References 24 publications
“…In [9], hybrid of traditional fault tolerance methods discuss for mobile distributed systems. Dependently fast library uses in [10] to implement the interface to prevent unexpected fail-slow tolerant distributed systems. A scheduling algorithm for cloud computing is proposed in [11] to minimize the response time of tasks present in backup after multiple failures of tasks.…”
Section: Background and Preliminaries
Citation type: mentioning
confidence: 99%
“…In [16] also the fault-tolerance concept implements in big data. Based on aforementioned work, it is clear that fault tolerance is not important in RTS only, but in every domain [9][10][11][12][13][14][15][16][17].…”
Section: Background and Preliminaries
Citation type: mentioning
confidence: 99%