2016
DOI: 10.48550/arxiv.1607.01152
Preprint

How to Evaluate the Quality of Unsupervised Anomaly Detection Algorithms?

Abstract: When sufficient labeled data are available, classical criteria based on Receiver Operating Characteristic (ROC) or Precision-Recall (PR) curves can be used to compare the performance of unsupervised anomaly detection algorithms. However, in many situations, few or no data are labeled. This calls for alternative criteria one can compute on non-labeled data. In this paper, two criteria that do not require labels are empirically shown to discriminate accurately (w.r.t. ROC or PR based criteria) between algorithms…
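For context on the labeled-data baseline the abstract refers to, here is a minimal sketch of comparing two off-the-shelf unsupervised detectors with ROC and PR criteria when ground-truth labels happen to be available. The dataset, detector choices, and parameter values are illustrative assumptions, not the paper's experimental setup.

```python
# Minimal sketch: with labels available, unsupervised detectors can be
# compared via ROC-AUC / PR-AUC, as the abstract describes.
# Everything below (data, models, parameters) is an illustrative assumption.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.svm import OneClassSVM
from sklearn.metrics import roc_auc_score, average_precision_score

rng = np.random.RandomState(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(950, 2)),     # nominal points
               rng.uniform(-6.0, 6.0, size=(50, 2))])   # anomalies
y = np.r_[np.zeros(950), np.ones(50)]                    # 1 = anomaly

for name, model in [("IsolationForest", IsolationForest(random_state=0)),
                    ("OneClassSVM", OneClassSVM(gamma="scale"))]:
    model.fit(X)
    anomaly_score = -model.decision_function(X)          # higher = more anomalous
    print(f"{name}: ROC-AUC={roc_auc_score(y, anomaly_score):.3f}, "
          f"PR-AUC={average_precision_score(y, anomaly_score):.3f}")
```

The paper's point is precisely that such a comparison is unavailable when y is missing; the label-free criteria it proposes are meant to play the same role.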

Cited by 7 publications (7 citation statements)
References 13 publications
“…As observed in [15], this is an inherent problem in unsupervised anomaly detection algorithms. In the following sections we address this lack by injecting artificial anomalies into a synthetic dataset and comparing this to the Tor Project's existing anomaly detection approach, as well as evaluating our method against an existing list of known filtering events.…”
Section: Validation
Citation type: mentioning (confidence: 95%)
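As a rough illustration of the validation strategy this excerpt describes (not the cited work's actual data or pipeline), anomalies can be planted at known positions in a synthetic series, yielding ground truth that the detector never sees during fitting:

```python
# Illustrative sketch only: inject artificial anomalies into a synthetic
# daily-count series so detector output can later be scored against known
# ground truth. All values here are assumptions made for demonstration.
import numpy as np

rng = np.random.RandomState(1)
counts = rng.poisson(lam=100, size=365).astype(float)  # synthetic daily counts
anomaly_days = [40, 180, 300]                          # chosen injection points
counts[anomaly_days] *= 0.2                            # simulate sudden drops

labels = np.zeros(counts.shape[0], dtype=int)          # ground truth labels
labels[anomaly_days] = 1                               # 1 = injected anomaly
```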
“…The topmost layer is an application layer that offers various pre-built industrial templates to build reusable applications. In the case of unsupervised exploration, our system provides several ranking methods such as EM Score and AL Score (Goix 2016) that do not require explicit label information. In the case of semi-supervised exploration, the user provides a small amount of labeled data for obtaining the rank of each pipeline in the Workflow.…”
Section: AnomalyKiTS: System Overview
Citation type: mentioning (confidence: 99%)
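For readers unfamiliar with the EM Score mentioned here, the following is a hedged sketch (not the AnomalyKiTS implementation) of how an Excess-Mass style criterion in the spirit of Goix (2016) can be estimated by Monte Carlo and used to rank detectors without labels. The em_score helper is hypothetical, and the bounding-box volume estimate and the level grid are simplifying assumptions.

```python
# Hedged sketch of a label-free Excess-Mass (EM) style score, in the spirit of
# Goix (2016). Not the AnomalyKiTS code: the Lebesgue-measure estimate via a
# bounding box and the level grid below are simplifying assumptions.
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.svm import OneClassSVM

def em_score(score, X, n_mc=10_000, n_levels=100, seed=0):
    """Average Excess-Mass over a grid of levels t (higher is better)."""
    rng = np.random.RandomState(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    volume = float(np.prod(hi - lo))                             # bounding-box volume
    U = rng.uniform(lo, hi, size=(n_mc, X.shape[1]))             # uniform samples for Leb estimate
    s_data, s_unif = score(X), score(U)
    thresholds = np.unique(s_data)
    mass = (s_data[:, None] >= thresholds).mean(axis=0)          # estimate of P(s(X) >= u)
    leb = (s_unif[:, None] >= thresholds).mean(axis=0) * volume  # estimate of Leb{s >= u}
    t_grid = np.linspace(0.0, 1.0 / volume, n_levels)            # assumed range of levels t
    return float(np.mean([np.max(mass - t * leb) for t in t_grid]))

# Rank two detectors without labels (higher EM score preferred); data is synthetic.
X = np.random.RandomState(0).normal(size=(500, 2))
iso = IsolationForest(random_state=0).fit(X)
ocsvm = OneClassSVM(gamma="scale").fit(X)
print("iForest EM score:", em_score(iso.score_samples, X))   # score_samples: higher = more normal
print("OC-SVM  EM score:", em_score(ocsvm.score_samples, X))
```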
“…For anomaly detection problems, traditional ways of evaluating the system with the Receiver Operating Characteristic (ROC) are not possible due to the absence of labeled data. In this paper, we use the ranked probability scores to evaluate an anomaly classification model ([28], [29]).…”
Section: Performance Evaluation
Citation type: mentioning (confidence: 99%)
“…Therefore, for a stable model and configuration setup, the corresponding scores are expected to produce relatively low values for the MV function and relatively high values for the EM function. The final performance criterion is computed based on the averages of the MV and the EM function over certain pre-specified domains for α and t [28]. In practice, we consider,…”
Section: Performance Evaluation
Citation type: mentioning (confidence: 99%)
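For completeness, a hedged reconstruction of the Mass-Volume (MV) and Excess-Mass (EM) functions this excerpt refers to, in the notation of Goix (2016), where s is the scoring function and Leb the Lebesgue measure; the exact integration domains I^MV and I^EM are fixed in that paper and are only abbreviated here.

```latex
% Sketch of the label-free criteria; s is the scoring function, Leb the Lebesgue measure.
\[
  \mathrm{MV}_s(\alpha) \;=\; \inf_{u \ge 0} \,\mathrm{Leb}\{x : s(x) \ge u\}
  \quad \text{subject to} \quad \mathbb{P}\big(s(X) \ge u\big) \ge \alpha ,
\]
\[
  \mathrm{EM}_s(t) \;=\; \sup_{u \ge 0} \,\Big\{ \mathbb{P}\big(s(X) \ge u\big)
  \;-\; t \,\mathrm{Leb}\{x : s(x) \ge u\} \Big\} .
\]
% Aggregated performance criteria: averages over pre-specified domains for alpha and t
% (low MV is better, high EM is better).
\[
  \mathcal{C}^{\mathrm{MV}}(s) \;=\; \int_{I^{\mathrm{MV}}} \mathrm{MV}_s(\alpha)\, d\alpha ,
  \qquad
  \mathcal{C}^{\mathrm{EM}}(s) \;=\; \int_{I^{\mathrm{EM}}} \mathrm{EM}_s(t)\, dt .
\]
```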