2019 IEEE International Conference on Big Data (Big Data) 2019
DOI: 10.1109/bigdata47090.2019.9006151
|View full text |Cite
|
Sign up to set email alerts
|

Automatic Hyperparameter Tuning Method for Local Outlier Factor, with Applications to Anomaly Detection

Abstract: In recent years, there have been many practical applications of anomaly detection such as in predictive maintenance, detection of credit fraud, network intrusion, and system failure. The goal of anomaly detection is to identify in the test data anomalous behaviors that are either rare or unseen in the training data. This is a common goal in predictive maintenance, which aims to forecast the imminent faults of an appliance given abundant samples of normal behaviors. Local outlier factor (LOF) is one of the stat… Show more

Help me understand this report
View preprint versions

Search citation statements

Order By: Relevance

Paper Sections

Select...
3
1
1

Citation Types

0
12
0

Year Published

2019
2019
2024
2024

Publication Types

Select...
5
2
1

Relationship

0
8

Authors

Journals

citations
Cited by 37 publications
(12 citation statements)
references
References 34 publications
0
12
0
Order By: Relevance
“…Ensemble based methods: The first ensemble learning approach to outlier detection runs on LOF when they are learned with different sets of hyperparameters such that the resultant combination is the anomaly scores [36]. Isolation Forest (IF) is another ensemble-based algorithm that builds a forest of random binary trees such that anomalous instances have short average path lengths on the trees [21; 22; 14].…”
Section: Related Workmentioning
confidence: 99%
“…Ensemble based methods: The first ensemble learning approach to outlier detection runs on LOF when they are learned with different sets of hyperparameters such that the resultant combination is the anomaly scores [36]. Isolation Forest (IF) is another ensemble-based algorithm that builds a forest of random binary trees such that anomalous instances have short average path lengths on the trees [21; 22; 14].…”
Section: Related Workmentioning
confidence: 99%
“…The performance of LOF is highly dependent on the values of contamination and n_neighbors [29]. We set the value of n_neighbors to 20, which is the default value of the utilized Machine Learning algorithm [27] and defines the number of neighbors that need to be taken into consideration to detect the outliers.…”
Section: Testing the Modelmentioning
confidence: 99%
“…The performance of an LOF algorithm depends on its parameters' values, contamination and neighborhood size [29]. When experimenting with synthetic data with the known anomaly portion, contamination value and neighborhood size can be tuned based on this known anomaly portion data and report better results.…”
Section: Testing the Modelmentioning
confidence: 99%
See 1 more Smart Citation
“…All of above LOF-based algorithms determine whether the data point is an outlier by examining the outlier factor of the points in K -distance neighborhood, and they usually suffer from the high time overhead problem. In order to overcome this problem, various solutions have been proposed, including POLOF algorithm (Optimized Pruning-based Outlier Detecting algorithm), NLOF algorithm, IncLOF algorithm (Incremental Local Outlier Factor), INFLOF algorithm (Influenced Local Outlier Factor) and so on [29]- [31]. These algorithms combined clustering algorithm with the outlier detection algorithms to achieve better performance.…”
Section: Introductionmentioning
confidence: 99%