2020
DOI: 10.17775/cseejpes.2020.04080

A big data cleaning method based on improved CLOF and Random Forest for distribution network

Cited by 6 publications (4 citation statements)
References: 0 publications
“…The results achieved by the proposed method were presented in figures. Liu et al [29] improved the cluster-based local outlier factor (CLOF) and the random forest algorithm to impute missing data and detect outliers in batch processing. Datasets comprising data on electricity consumption, network loss rate, and active power were utilized to compare the results with those achieved by the old CLOF algorithm in terms of detecting outliers, and they were also compared to K-nearest neighbors, MICE, and missForest results in terms of imputing missing values.…”
Section: Machine Learning (mentioning)
confidence: 99%
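
For readers unfamiliar with the outlier score named in the statement above, here is a minimal sketch of the plain local outlier factor (LOF) in pure Python. It is not the improved cluster-based variant (CLOF) attributed to Liu et al., whose details the statement does not give. The function name `lof_scores`, the choice `k=2`, and the sample points are assumptions made for illustration only.

```python
import math

def lof_scores(points, k=2):
    """Plain LOF for a small list of points (illustrative, O(n^2)).

    Assumes no duplicate points; duplicates can make a reachability
    sum zero and would need special handling.
    """
    n = len(points)
    # Pairwise Euclidean distances.
    dist = [[math.dist(p, q) for q in points] for p in points]

    neighbors = []  # indices of the k nearest neighbours of each point
    k_dist = []     # distance from each point to its k-th nearest neighbour
    for i in range(n):
        order = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        nearest = order[:k]
        neighbors.append(nearest)
        k_dist.append(dist[i][nearest[-1]])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], dist[i][j]) for j in neighbors[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean ratio of the neighbours' density to the point's own density.
    return [sum(lrd[j] for j in neighbors[i]) / (k * lrd[i]) for i in range(n)]

# Hypothetical sample: four points on a unit square plus one distant point.
sample = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
scores = lof_scores(sample, k=2)
```

Scores close to 1 indicate a point about as dense as its neighbours; scores well above 1 mark candidates for outliers. In this sample the distant point `(5, 5)` receives the largest score.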
“…Furthermore, both Fitters et al [40] and Albattah et al [45] detected contextual outliers without tackling missing data. In addition, Liu et al [29] and Kulanuwat et al [54] both dealt with outliers before dealing with the missing value. However, missing values need to be tackled before detecting outliers, especially contextual outliers, because missing values may hide outliers and affect outlier detection Han et al [3].…”
Section: RQ2: Which Data Cleaning Issue Is Most Commonly Discussed Du... (mentioning)
confidence: 99%
“…After that, the missing values labelled with "−2" and the normalized data are combined to form the dataset. Choosing −2 is to distinguish missing values from the normalized data (Liu et al, 2020). In the second stage, we randomly sample from the dataset to train PCA-KMeans, GMM, and iForest algorithms to predict outliers using a soft voting mechanism.…”
Section: Problem Descriptions (mentioning)
confidence: 99%
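
The labelling step described in the statement above can be sketched as follows. The statement says only that the data are normalized and that missing values are marked with −2. Min-max scaling to [0, 1] is an assumption here, chosen because it makes −2 unambiguous. The function name `normalize_and_flag` and the sample readings are hypothetical.

```python
import math

MISSING = -2.0  # sentinel that cannot collide with values scaled into [0, 1]

def normalize_and_flag(values, sentinel=MISSING):
    """Min-max scale the observed values and mark missing ones with a sentinel.

    Missing entries are given as None or NaN. The scaling method is an
    assumption; the cited statement does not name one.
    """
    observed = [v for v in values if v is not None and not math.isnan(v)]
    if not observed:
        return [sentinel] * len(values)
    lo, hi = min(observed), max(observed)
    span = (hi - lo) or 1.0  # avoid division by zero for constant series
    return [
        sentinel if v is None or math.isnan(v) else (v - lo) / span
        for v in values
    ]

# Hypothetical readings with two gaps.
readings = [1.0, None, 3.0, float("nan"), 5.0]
flagged = normalize_and_flag(readings)
# flagged == [0.0, -2.0, 0.5, -2.0, 1.0]
```

Because every observed value lands in [0, 1], a downstream model can treat −2 as a distinct "missing" marker without confusing it with real measurements.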
“…Furthermore, ref. [131] has implemented local outlier factor and random forest to clean big data related to distribution networks. As another example, an abnormal data cleaning approach related to wind turbines has been introduced in [132].…”
mentioning
confidence: 99%