2020
DOI: 10.1108/ec-05-2019-0242
The synergistic combination of fuzzy C-means and ensemble filtering for class noise detection

Abstract:
Purpose: The purpose of this study is to enhance data quality and overall accuracy and improve certainty by reducing the negative impacts of the FCM algorithm while clustering real-world data and also decreasing the inherent noise in data sets.
Design/methodology/approach: The present study proposed a new effective model based on fuzzy C-means (FCM), ensemble filtering (ENS) and machine learning algorithms, called an FCM-ENS model. This model is mainly composed of three parts: noise detection, noise filtering …
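
As a rough illustration of the pipeline the abstract describes, the sketch below combines fuzzy C-means noise detection with a majority-vote ensemble filter. It is a minimal sketch under assumed choices (skfuzzy for FCM, a 0.5 membership threshold, three scikit-learn base classifiers, removal of samples flagged by both stages), not the authors' published procedure.

```python
# Sketch: FCM-based noise flagging followed by an ensemble (majority-vote)
# filter.  Library choices, thresholds and classifiers are illustrative
# assumptions, not the paper's exact FCM-ENS configuration.
import numpy as np
import skfuzzy as fuzz                      # assumed FCM implementation
from sklearn.model_selection import cross_val_predict
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB

def fcm_flag(X, n_clusters=3, m=2.0, min_membership=0.5):
    """Flag samples whose highest fuzzy membership is low (possible noise)."""
    # skfuzzy expects data as (n_features, n_samples)
    _, u, *_ = fuzz.cluster.cmeans(X.T, c=n_clusters, m=m,
                                   error=1e-5, maxiter=300, seed=0)
    return u.max(axis=0) < min_membership   # True = suspected noise

def ensemble_flag(X, y, n_folds=5):
    """Flag samples misclassified by a majority of cross-validated learners."""
    y = np.asarray(y)
    learners = [DecisionTreeClassifier(random_state=0),
                KNeighborsClassifier(),
                GaussianNB()]
    votes = np.zeros(len(y))
    for clf in learners:
        pred = cross_val_predict(clf, X, y, cv=n_folds)
        votes += (pred != y)
    return votes > len(learners) / 2        # majority disagrees with label

def fcm_ens_filter(X, y):
    """Drop samples that both stages mark as class noise (NumPy arrays)."""
    noisy = fcm_flag(X) & ensemble_flag(X, y)
    return X[~noisy], np.asarray(y)[~noisy]
```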

Cited by 6 publications (2 citation statements)
References 45 publications (93 reference statements)

“…The accuracy formula is applied to calculate the performance of the proposed technique in classification [24,48] using the confusion matrix. In the following formula, True Negative refers to correctly rejected samples, True Positive (TP) refers to correctly identified samples, False Positive refers to incorrectly identified samples and False Negative (FN) means incorrectly rejected samples:…”
Section: Performance Measures
Mentioning confidence: 99%
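
The excerpt above stops short of the formula itself; the measure it describes is the standard confusion-matrix accuracy, (TP + TN) / (TP + TN + FP + FN). A minimal sketch using scikit-learn's confusion matrix (the helper name and example data are illustrative):

```python
# Standard confusion-matrix accuracy:
#   accuracy = (TP + TN) / (TP + TN + FP + FN)
from sklearn.metrics import confusion_matrix

def accuracy_from_confusion(y_true, y_pred):
    """Accuracy = correctly classified samples over all samples."""
    cm = confusion_matrix(y_true, y_pred)   # rows: true class, cols: predicted class
    return cm.trace() / cm.sum()            # diagonal holds TN and TP in the binary case

# Example: 8 of 10 samples classified correctly -> 0.8
print(accuracy_from_confusion([0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
                              [0, 0, 0, 1, 1, 1, 1, 1, 1, 0]))
```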
“…H2O works better than RNN for transactional data because RNN is strong in sequential or time series data [22]. To improve performance of the model, ensemble approaches like Random Forests and Gradient Boosting Regression Trees [23], and Bagging [18], Smoothed Bootstrap Resampling [26], [27] could be used to reduce the negative effect of inherent noise [28] and overfitting. Li states that Random Forests and Gradient Boosting Regression Trees do not improve the prediction quality [23].…”
Section: Introduction
Mentioning confidence: 99%
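
As a brief illustration of the bagging idea named in the excerpt, the sketch below averages regressors fit on bootstrap resamples, which damps the influence of individual noisy samples and curbs overfitting; the estimator and parameters are illustrative assumptions, not those of the cited studies.

```python
# Bagging (one of the ensemble approaches mentioned above): averaging
# predictors trained on bootstrap resamples reduces the effect of inherent
# noise and overfitting.  Parameters here are illustrative only.
from sklearn.ensemble import BaggingRegressor

model = BaggingRegressor(
    n_estimators=100,   # number of bootstrap resamples (default base learner: decision tree)
    bootstrap=True,     # sample training rows with replacement
    random_state=0,
)
# Usage: model.fit(X_train, y_train); y_hat = model.predict(X_test)
```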