2017 IEEE 24th International Conference on High Performance Computing (HiPC) 2017
DOI: 10.1109/hipc.2017.00044

Enabling Dependability-Driven Resource Use and Message Log-Analysis for Cluster System Diagnosis

Abstract: Recent work has used both failure logs and resource use data separately (and together) to detect system failure-inducing errors and to diagnose system failures. System failure occurs as a result of error propagation and the (unsuccessful) execution of error recovery mechanisms. Knowledge of error propagation patterns and unsuccessful error recovery is important for more accurate and detailed failure diagnosis, and knowledge of recovery protocols deployment is important for improving system reliability. This pa…

citations
Cited by 6 publications
(2 citation statements)
references
References 36 publications
“…[12], [18]- [20]) and many have been evaluated on different benchmark logs' datasets in [21]. Moreover, various tools and much research have been dedicated to diagnosing the root causes of failures such as [22]- [24].…”
Section: Related Work (mentioning)
confidence: 99%
“…Therefore, it is required to follow the action of text data conversion into numerical vectors. In the existing literature, researchers applied indexed-based methods [19] [20] [21] or semantic-based methods [22][23] [24] to extract the features of logs. In the index-based extraction, log data is converted to log template indexes, and afterward, sequential or quantitative features are extracted against the generated indexes.…”
Section: Introduction (mentioning)
confidence: 99%
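
The index-based extraction described in the statement above can be illustrated with a minimal sketch. It is not taken from any of the cited papers: the function names, the digit-masking template rule (`mask_digits`), and the sample messages are all assumptions made for illustration. Real systems typically use a dedicated log parser to derive templates.

```python
import re
from collections import Counter


def mask_digits(message):
    # Hypothetical template rule: replace every run of digits with a wildcard.
    return re.sub(r"\d+", "<*>", message)


def to_template_indexes(messages, template_of=mask_digits):
    # Assign each distinct template an integer index in order of first appearance.
    index_of = {}
    indexes = []
    for message in messages:
        template = template_of(message)
        if template not in index_of:
            index_of[template] = len(index_of)
        indexes.append(index_of[template])
    return indexes, index_of


def sequential_features(indexes, window):
    # Sliding windows over the index sequence (order of events is preserved).
    return [indexes[i:i + window] for i in range(len(indexes) - window + 1)]


def quantitative_features(indexes, n_templates):
    # Count vector: how often each template index occurs (order is discarded).
    counts = Counter(indexes)
    return [counts.get(i, 0) for i in range(n_templates)]


# Made-up sample messages, for illustration only.
messages = ["node 12 down", "node 7 down", "disk 3 full", "node 9 down"]
indexes, index_of = to_template_indexes(messages)
windows = sequential_features(indexes, window=2)
counts = quantitative_features(indexes, n_templates=len(index_of))
```

Here `windows` holds the sequential features and `counts` the quantitative ones; both are derived from the same template indexes, which is the distinction the quoted passage draws.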