2017
DOI: 10.1016/j.jpdc.2017.02.003

Hybrid Message Pessimistic Logging. Improving current pessimistic message logging protocols

Abstract: With the growing scale of HPC applications, there has been an increase in the number of interruptions as a consequence of hardware failures. The remarkable decrease of Mean Time Between Failures (MTBF) in current systems encourages the research of suitable fault tolerance solutions. Message logging combined with uncoordinated checkpoint compose a scalable rollback-recovery solution. However, message logging techniques are usually responsible for most of the overhead during failure-free executions. Taking this …

Cited by 13 publications (11 citation statements)
References 22 publications

“…It is classified into two categories, synchronous and asynchronous, depending on when the always-no-orphans condition is ensured [3]. The first, also called pessimistic logging [4,5,7,9,10,14], forces each message to be logged as soon as it is received or before transmitting the message expected to be sent in the first place after the former message. In contrast, the second, called optimistic logging [11,12], allows logging each message to be delayed up to the favorable time to its receiver expecting the task will be successfully finished before a failure occurs.…”
Section: J Ahn (citation type: mentioning)
confidence: 99%
“…Sender-based message logging (SBML) is a lightweight synchronous one that performs volatile logging by saving each message recovery information in its sender's volatile memory while ensuring the always-no-orphans condition in case of sequential failures [4]. However, the inherent drawback of the conventional SBML protocols [4,5,7,9,10] is to require additional control message interactions for every application message to satisfy the condition. However, we found that the control message interaction overhead of the conventional SBML may highly be reduced if a distributed application repetitively exhibits a sequence of one way message exchange patterns to each process.…”
Section: J Ahn (citation type: mentioning)
confidence: 99%
“…Whenever an inter-group message m from another group is transmitted to a group, CS BML forces the group leader to always play the role of sender of the message as virtual sender for logging procedure and keeping all log information of the message in its buffer. Thanks to this feature, the protocol makes no extra inter-group control message that may be needed for traditional SBML [3]-[5]. Immediate dependent message logging of HML also requires no additional inter-group control message.…”
Section: Evaluation and Concluding Remarks (citation type: mentioning)
confidence: 99%
“…Message logging protocols have been presented as lightweight fault-tolerance technique to solve this problem [4]. Among them, in order to alleviate the high failure free overhead of receiver-based pessimistic message logging, sender-based message logging using volatile memory of its sender as storage for logging has been widely adopted [1], [3]-[5]. But, most of the previous sender-based message logging protocols [1], [3]-[5] commonly have message senders get receive sequence numbers (RSNs) of the messages from their receivers and confirm them with the receivers.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
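
The RSN exchange described in the statements above can be illustrated with a small in-memory sketch. It is not taken from the indexed paper or from any citing paper: the names `Process`, `LogEntry`, and `replay_for` are hypothetical, messages are plain method calls rather than network traffic, and the model assumes a single receiver failure with all sender logs intact. It only shows the general idea that a sender keeps each message in its own volatile log, learns the receive sequence number (RSN) from the receiver, and confirms it, which costs two control messages per application message.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class LogEntry:
    dest: str                  # name of the receiving process
    ssn: int                   # send sequence number chosen by the sender
    payload: str
    rsn: Optional[int] = None  # filled in once the receiver reports its RSN


@dataclass
class Process:
    name: str
    next_ssn: int = 0
    next_rsn: int = 0
    send_log: Dict[Tuple[str, int], LogEntry] = field(default_factory=dict)
    delivered: List[Tuple[int, str, str]] = field(default_factory=list)
    unconfirmed: Set[Tuple[str, int]] = field(default_factory=set)

    def send(self, dest: "Process", payload: str) -> int:
        ssn = self.next_ssn
        self.next_ssn += 1
        # Log the message in the sender's volatile memory before delivery.
        entry = LogEntry(dest.name, ssn, payload)
        self.send_log[(dest.name, ssn)] = entry
        # Control message 1 (modelled as a return value): receiver reports the RSN.
        entry.rsn = dest.receive(self.name, ssn, payload)
        # Control message 2: sender confirms that the RSN is now logged.
        dest.confirm(self.name, ssn)
        return ssn

    def receive(self, src: str, ssn: int, payload: str) -> int:
        rsn = self.next_rsn
        self.next_rsn += 1
        self.delivered.append((rsn, src, payload))
        self.unconfirmed.add((src, ssn))
        return rsn

    def confirm(self, src: str, ssn: int) -> None:
        self.unconfirmed.discard((src, ssn))

    def can_send(self) -> bool:
        # A receiver with unconfirmed RSNs must hold its own sends,
        # otherwise a failure could leave orphan messages behind.
        return not self.unconfirmed


def replay_for(receiver_name: str, senders: List[Process]) -> List[str]:
    """Rebuild the receiver's delivery order from the senders' logs."""
    entries = [
        e
        for s in senders
        for e in s.send_log.values()
        if e.dest == receiver_name and e.rsn is not None
    ]
    return [e.payload for e in sorted(entries, key=lambda e: e.rsn)]


p, q, r = Process("p"), Process("q"), Process("r")
p.send(r, "a")
q.send(r, "b")
p.send(r, "c")
recovered = replay_for("r", [p, q])
```

Here `recovered` holds the payloads in the order `r` originally delivered them, reconstructed only from the logs kept by `p` and `q`. The two extra calls inside `send` are the per-message control overhead that the quoted statements identify as the drawback of conventional sender-based logging.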