2011 IEEE 30th International Symposium on Reliable Distributed Systems
DOI: 10.1109/srds.2011.28

netCSI: A Generic Fault Diagnosis Algorithm for Large-Scale Failures in Computer Networks

Abstract: In this paper we present a framework and a set of algorithms for determining faults in networks when large-scale outages occur. The design principles of our algorithm, netCSI, are motivated by the fact that failures are geographically clustered in such cases. We address the challenge of determining faults with incomplete symptom information due to a limited number of reporting nodes in the network. netCSI consists of two parts: a hypothesis generation algorithm and a ranking algorithm. When constructing the hypot…

Cited by 6 publications (7 citation statements)
References 11 publications
“…In this context, there can be three possible kinds of information: cluster information, object distance information (OD), and no information (NI). We do not describe the conditional failure probability model with no information (CFPM-NI) here because it is straightforward, and details can be found in [21].…”
Section: Conditional Failure Probability Models
mentioning
confidence: 99%
“…However, there is considerable interest on other various aspects of large-scale failures in current literature [11], [12]. Recently, netCSI [13], a combinatorial based algorithm is proposed to diagnose large-scale failures. However, there is a limitation of run-time in large networks.…”
Section: Related Work
mentioning
confidence: 99%
“…During a series of failures that include both independent and clustered, AMC results in a reduced number of false negatives and false positives. [13] is proposed to localize large-scale failures in networks. It is shown that by considering the failure patterns of large-scale outages, this algorithm can achieve higher accuracy than existing algorithms developed for independent failures [10].…”
mentioning
confidence: 99%