2021
DOI: 10.48550/arxiv.2105.09270
Preprint
Do We Really Need to Learn Representations from In-domain Data for Outlier Detection?

Abstract: Unsupervised outlier detection, which predicts whether a test sample is an outlier using only information from unlabelled inlier data, is an important but challenging task. Recently, methods based on a two-stage framework have achieved state-of-the-art performance on this task. The framework leverages self-supervised representation learning algorithms to train a feature extractor on inlier data, and applies a simple outlier detector in the feature space. In this paper, we explore the possibility of avoiding …
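The two-stage framework the abstract describes can be illustrated with a minimal sketch: features are assumed to come from a pretrained extractor, and a simple Mahalanobis-distance detector is fit on the inlier features. The function names below are hypothetical, not taken from the paper's code.

```python
import numpy as np

def fit_mahalanobis(train_feats):
    """Stage 2a: fit a Gaussian (mean + precision) to inlier features.

    train_feats: (n_samples, n_dims) array of features from a
    pretrained extractor (stage 1, not shown here).
    """
    mu = train_feats.mean(axis=0)
    cov = np.cov(train_feats, rowvar=False)
    prec = np.linalg.pinv(cov)  # pseudo-inverse for numerical safety
    return mu, prec

def mahalanobis_score(feat, mu, prec):
    """Stage 2b: score a test feature; larger => more outlier-like."""
    d = feat - mu
    return float(d @ prec @ d)
```

The detector itself is deliberately simple; the framework's strength comes from the quality of the learned (or, as this paper asks, possibly off-the-shelf) representation.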

Cited by 5 publications (11 citation statements)
References 60 publications
“…Baselines. We compare our methods with four existing baselines: Likelihood (Bishop, 1994), Input Complexity (Serrà et al., 2019), Likelihood Regret (Xiao et al., 2020), and Pretrained Feature Extractor + Mahalanobis Distance (Xiao et al., 2021). Likelihood is obtained from the DM using the implementation from Song et al. (2020).…”
Section: Algorithm 1 Inpaint
confidence: 99%
“…Specifically, algorithm selection is important in unsupervised AD, but limited works [8,67,119] have studied this. We may consider self-supervision [83,95,108] and transfer learning [20] to improve tabular AD as well. Thus, we suggest more focus on large-scale evaluation, task-driven algorithm selection, and data augmentation/transfer for unsupervised AD.…”
Section: Overall Model Performance On Real-world Datasets With Varyin...
confidence: 99%
“…[26] used the k-nearest neighbors distance between the test input and training set features as an anomaly score. [39] trained a GMM on the normal sample features, which could then identify anomalous samples as those falling in low-probability regions. PANDA [2] attempts to project the pre-trained features of the normal distribution to another compact feature space employing the DSVDD [40] objective function.…”
Section: Comparisons With State-of-the-art
confidence: 99%
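The kNN scoring rule mentioned in the excerpt above (distance between a test feature and the training-set features) can be sketched in a few lines. This is a minimal illustration, not the cited paper's implementation; `knn_score` is a hypothetical name.

```python
import numpy as np

def knn_score(test_feat, train_feats, k=5):
    """Anomaly score = Euclidean distance to the k-th nearest
    training feature; larger => more outlier-like.

    test_feat: (n_dims,) feature of the test sample.
    train_feats: (n_samples, n_dims) inlier features.
    """
    dists = np.linalg.norm(train_feats - test_feat, axis=1)
    return float(np.sort(dists)[k - 1])
```

Like the Mahalanobis and GMM detectors, this is a non-parametric scorer applied in a fixed feature space, so detection quality hinges on the feature extractor rather than the scoring rule.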