2020
DOI: 10.3390/s20205829

TADILOF: Time Aware Density-Based Incremental Local Outlier Detection in Data Streams

Abstract: Outlier detection in data streams is crucial to successful data mining. However, this task is made increasingly difficult by the enormous growth in the quantity of data generated by the expansion of Internet of Things (IoT). Recent advances in outlier detection based on the density-based local outlier factor (LOF) algorithms do not consider variations in data that change over time. For example, there may appear a new cluster of data points over time in the data stream. Therefore, we present a novel algorithm f…

Cited by 17 publications
(7 citation statements)
References 30 publications
“…In 7 of these data sets (Hepatitis, Biomed, Heart Disease, Boston_HP, Breast_O, SpamBase, and Satimage), the ROC-AUC score rises with the increase of k. On the other hand, according to the experimental results in [3], LOF values stabilize after k = 10. Therefore, the value of k should be chosen to be at least 10 in order to be less affected by fluctuations in the LOF value caused by k. In addition, the process time should be taken into account, since choosing the value of k too large will require high process time (cost) [45], [46]. In view of these facts, it would be a good option to choose the smallest k value greater than 10 according to M T = 0.50 to obtain the highest (or near) ROC-AUC score.…”
Section: E Detailed Analyses Of K
Citation type: mentioning
confidence: 99%
“…Then we compute LOF score of p_i according to equations 1, 2, and 3 and add p_i to the set of outliers, O, if LOF score of p_i is greater than LOF threshold, T (lines 7-10). At the same time, we update LOF score of each data point p_j in the reverse neighbour set of p_i and if the data point p_j transformed from outlier to inlier, we remove it from O (lines 11-16…”
Section: B Summarization Step
Citation type: mentioning
confidence: 99%
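
The insert-and-update step quoted above can be illustrated with a minimal Python sketch. It assumes the standard LOF definitions (k-distance, reachability distance, local reachability density), which may differ in detail from the equations 1, 2, and 3 that the citing paper refers to. The names `lof_scores` and `insert_point` are hypothetical. For clarity the sketch recomputes every score after an insertion instead of restricting the update to the reverse neighbour set, so it is not the incremental algorithm itself.

```python
import math

def k_nearest(points, i, k):
    """Return the k nearest neighbours of points[i] as (distance, index) pairs."""
    dists = sorted(
        (math.dist(points[i], q), j) for j, q in enumerate(points) if j != i
    )
    return dists[:k]

def lof_scores(points, k):
    """Standard (batch) LOF score for every point; assumes len(points) > k."""
    n = len(points)
    neigh = [k_nearest(points, i, k) for i in range(n)]
    k_dist = [nb[-1][0] for nb in neigh]  # distance to the k-th neighbour
    lrd = []
    for i in range(n):
        # Reachability distance of i from each neighbour j.
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        mean_reach = sum(reach) / len(reach)
        # Guard against duplicate points (zero mean reachability).
        lrd.append(1.0 / max(mean_reach, 1e-12))
    return [
        sum(lrd[j] for _, j in neigh[i]) / (len(neigh[i]) * lrd[i])
        for i in range(n)
    ]

def insert_point(points, outliers, p, k, threshold):
    """Append p, flag it if its LOF exceeds threshold, and un-flag any
    previously flagged point whose LOF has dropped to inlier level."""
    points.append(p)
    scores = lof_scores(points, k)  # simplification: full recomputation
    new_idx = len(points) - 1
    if scores[new_idx] > threshold:
        outliers.add(new_idx)
    for j in list(outliers):
        if scores[j] <= threshold:
            outliers.discard(j)
    return scores[new_idx]

# Usage: a small cluster near the origin, then a far-away point arrives.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5)]
outliers = set()
insert_point(points, outliers, (10.0, 10.0), k=3, threshold=1.5)
```

If further points later arrive near (10, 10) and form their own cluster, the same call removes the earlier point from `outliers`, which mirrors the outlier-to-inlier transition described in the quoted statement.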
“…The communication cost of these networks is also an essential factor. There are many approaches of outlier detection over data stream such as clustering based outlier detection [4], statistical based outlier detection [5], [6], distance based outlier detection [7], [8], [9], [10], [11], and density based outlier detection [12], [13], [14], [15], [16]. In this paper, we are interested in the density-based outlier detection over data streams.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…In [2], a new algorithmic rule for the purpose of streaming data, referred to as "time-aware density-based incremental local outlier detection", was proposed to conquer variations in data that change as time goes on. The results show that the proposed "time-aware density-based incremental local outlier detection" performs better than that of the existing candidates in the sense of the AUC in most of the cases on different kinds of datasets.…”
Section: Summary Of the Special Issue
Citation type: mentioning
confidence: 99%