2021
DOI: 10.1007/s42979-021-00772-9

Two Class Pruned Log Message Anomaly Detection

Abstract: Log messages are widely used in cloud servers and other systems. Millions of logs are generated each day, which makes them important for anomaly detection. However, they are complex, unstructured text messages, which makes this task difficult. In this paper, a hybrid log message anomaly detection technique is proposed which employs pruning of positive and negative logs. Reliable positive log messages are first selected using a Gaussian mixture model algorithm. Then reliable negative logs are selected using the K-…

Cited by 4 publications (2 citation statements)
References 44 publications
“…-Failures [24], [34]-[36], [38]-[41], [55], [60], [64], [71], [73] OpenStack [20] 2017 Virtual machines Failures [20], [26], [34]-[36], [42], [44], [50], [52], [63], [76]-[78] Hadoop [90] 2016 High-perf. comp.…”
Section: Data Set
Citation type: mentioning
confidence: 99%
“…2) Evaluation metrics: Quantitative evaluation (ER-2) of anomaly detection approaches typically revolves around counting the numbers of correctly detected anomalous samples as true positives (TP), incorrectly detected non-anomalous samples as false positives (FP), incorrectly undetected anomalous samples as false negatives (FN), and correctly undetected non-anomalous samples as true negatives (TN). In the most basic setting where events are labeled individually and samples represent single events (e.g., as in the BGL data set), it is relatively straightforward to evaluate detected events with binary classification [34], [36]. Some of the reviewed papers additionally consider a multi-class classification problem for data sets where different types of failures have distinct labels by computing the averages of evaluation metrics over all classes [55] or plotting confusion matrices [32].…”
Section: Data Set
Citation type: mentioning
confidence: 99%
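
The counting scheme described in the citation statement above can be illustrated with a minimal Python sketch. It is not taken from the cited papers. The function names, the label encoding (1 for anomalous, 0 for normal), and the choice of precision, recall, and F1 as the derived metrics are all assumptions made for illustration. The statement itself only names the four counts and mentions averaging metrics over classes.

```python
from collections import Counter


def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN), treating `positive` as the anomalous label."""
    counts = Counter()
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            # Flagged as anomalous: correct if the sample really is anomalous.
            counts["TP" if truth == positive else "FP"] += 1
        else:
            # Not flagged: a miss if the sample really is anomalous.
            counts["FN" if truth == positive else "TN"] += 1
    return counts["TP"], counts["FP"], counts["FN"], counts["TN"]


def precision_recall_f1(tp, fp, fn):
    """Derive precision, recall and F1 from the counts (assumed metric choice)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def macro_average(y_true, y_pred):
    """Average the three metrics over all classes, one-vs-rest per class."""
    classes = sorted(set(y_true) | set(y_pred))
    per_class = [
        precision_recall_f1(*confusion_counts(y_true, y_pred, c)[:3])
        for c in classes
    ]
    return tuple(sum(m[i] for m in per_class) / len(per_class) for i in range(3))


# Hypothetical labels for six log events (1 = anomalous, 0 = normal).
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(tp, fp, fn, tn)
print(precision_recall_f1(tp, fp, fn))
```

In the binary setting only `confusion_counts` and `precision_recall_f1` are needed. `macro_average` sketches one way to average metrics over all classes when different failure types carry distinct labels.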