2019
DOI: 10.1109/tnet.2019.2938228

Understanding the Limits of Passive Realtime Datacenter Fault Detection and Localization

Cited by 18 publications
(16 citation statements)
References 26 publications
“…Machine learning techniques cover a range of unsupervised and supervised machine learning to identify and locate faults. Unsupervised learning includes detecting abrupt changes in network switch event counters by learning the normal values of those counters or by using outlier detection on TCP statistics of application flows [22]. Supervised learning techniques include using logistic regression to learn the mapping between network event data and fault classes [23] or using support vector machines, multilayer perceptrons, and random forests to learn the mapping between rate/delay/loss measures and fault classes [24].…”
Section: A Related Work (mentioning)
confidence: 99%
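The "outlier detection on TCP statistics of application flows" mentioned in the statement above can be illustrated with a minimal sketch. It is not the cited paper's method: it assumes a single per-flow statistic (retransmission counts) and swaps in a generic median-absolute-deviation test. The function name `mad_outliers`, the threshold of 3.5, and the sample data are all hypothetical.

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds `threshold`.

    Uses the median absolute deviation (MAD), which is less sensitive to
    the outliers themselves than a mean/standard-deviation test.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All "typical" values are identical; flag anything that differs.
        return [i for i, v in enumerate(values) if v != med]
    return [
        i
        for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


# Hypothetical per-flow retransmission counts; the last flow stands out.
retransmissions = [1, 2, 1, 0, 2, 1, 40]
suspect_flows = mad_outliers(retransmissions)
```

A robust statistic is used here only because a single badly affected flow would otherwise inflate the mean and mask itself; a real detector would likely combine several TCP statistics.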
“…We simulated faulty scenarios using fault injection techniques. We injected fault types commonly used in the evaluation of state-of-the-art fault localization approaches [38,36,16,22]: packet loss, memory leak and CPU hog. For each fault we considered different severity growth patterns: (i) linear pattern, the fault is triggered with a same frequency over time, (ii) exponential pattern, the fault is activated with a frequency that increases exponentially, resulting in a shorter time to failure, (iii) random pattern, the fault is activated randomly over time.…”
Section: Investigated Faults (mentioning)
confidence: 99%
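The three severity growth patterns in the statement above can be sketched as schedules of fault activation times. This is an illustrative reading, not the citing authors' injector: the function name `activation_times` and the parameters `base_interval`, `growth`, `n_random`, and `seed` are assumptions, and "frequency that increases exponentially" is modeled by dividing the gap between activations by a constant factor.

```python
import random


def activation_times(pattern, duration, base_interval=10.0,
                     growth=2.0, n_random=5, seed=0):
    """Return sorted times (seconds) at which a fault is triggered."""
    if pattern == "linear":
        # Same frequency over time: a constant gap between activations.
        times = []
        t = base_interval
        while t <= duration:
            times.append(t)
            t += base_interval
        return times
    if pattern == "exponential":
        # Frequency grows exponentially: each gap shrinks by `growth`.
        times = []
        t = 0.0
        interval = base_interval
        while True:
            t += interval
            if t > duration or interval < 1e-3:
                break
            times.append(t)
            interval /= growth
        return times
    if pattern == "random":
        # Activations drawn uniformly at random over the run.
        rng = random.Random(seed)
        return sorted(rng.uniform(0, duration) for _ in range(n_random))
    raise ValueError(f"unknown pattern: {pattern}")
```

For example, `activation_times("linear", 30)` yields activations at 10, 20, and 30 seconds, while the exponential schedule packs activations ever closer together, which matches the "shorter time to failure" described in the statement.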
“…deTector [8] presents an algorithm to minimize the number of probes sent for detecting and localizing packet losses and latency spikes. [13], [14] and [15] employ passive measurement for network faults localization. [13] presents a classification algorithm that identifies the root cause of failure using TCP statistics collected at one of the endpoints.…”
Section: Background and Related Work (mentioning)
confidence: 99%
“…[13] presents a classification algorithm that identifies the root cause of failure using TCP statistics collected at one of the endpoints. The work in [14] looks from the end-host to identify the faulty links and switches, by correlating anomalies in end-host statistics with the network path of the traffic. Vigil [15] tracks the path of TCP connections that display retransmissions through traceroute, and identifies the links with the most retransmissions as the faulty ones.…”
Section: Background and Related Work (mentioning)
confidence: 99%
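The idea attributed to Vigil in the statement above (blame the links that carry the most retransmissions) can be sketched as a simple tally. This is a deliberately simplified illustration, not Vigil's actual algorithm: the function name `rank_links`, the link labels, and the input format (a list of `(path, retransmission_count)` pairs, with paths assumed to come from something like traceroute) are hypothetical.

```python
from collections import Counter


def rank_links(flows):
    """Rank links by the retransmissions of the flows that traverse them.

    `flows` is an iterable of (path, retransmissions) pairs, where `path`
    is a list of link identifiers. Flows without retransmissions are
    ignored, mirroring the passive approach of only tracing flows that
    show trouble.
    """
    votes = Counter()
    for path, retransmissions in flows:
        if retransmissions <= 0:
            continue
        for link in path:
            votes[link] += retransmissions
    # Most suspicious links first.
    return votes.most_common()


# Hypothetical observations: three troubled flows share link "a-b".
flows = [
    (["a-b", "b-c"], 3),
    (["a-b", "b-d"], 2),
    (["e-f"], 0),
    (["a-b", "b-c"], 1),
]
ranking = rank_links(flows)
```

Here the shared link `"a-b"` ranks first because every troubled flow crosses it; healthy links on the same paths also collect votes, so a practical scheme would need enough diverse paths to separate them.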