2018
DOI: 10.1002/cpe.4466

Large dataset summarization with automatic parameter optimization and parallel processing for local outlier detection

Abstract: As one of the most important research problems in data analytics and data mining, outlier detection from large datasets has drawn much research attention in recent years. In this paper, we investigate the interesting research problem of summarizing large datasets for supporting efficient local outlier detection. To summarize large datasets, efficient summarization algorithms are proposed that produce a highly compact summary of the original dataset, which can be applied to detect local outliers from f…

citations
Cited by 8 publications
(2 citation statements)
references
References 40 publications
“…The proposed HOR method mainly composes of two stages, called; Fast Rejection (FR) stage and Accurate Rejection (AR) stage as shown in figure 5. In FR stage, standard division is used as a statistical-based method to quickly reject outliers from the training dataset as possible [21, 22]. In AR stage, Binary Gray Wolf Optimization (BGWO) method is used as an optimization technique to accurately remove the rest of outliers in the training data to improve the performance of the classification model [23].…”
Section: The Proposed Covid-19 Prudential Expectation Strategy (Cpes)
mentioning
confidence: 99%
“…Similarly, Zhang et al apply Hadoop in speeding up the computation in power load forecast by clustering the spatial‐temporal load data. Shou and Li parallelize the process of local outlier detection in large data set summarization albeit Hadoop is not employed in this work.…”
mentioning
confidence: 99%