2019
DOI: 10.1002/int.22193

From Big to Smart Data: Iterative ensemble filter for noise filtering in Big Data classification

Abstract: The quality of the data is directly related to the quality of the models drawn from that data. For that reason, much research is devoted to improving the quality of the data and amending the errors it may contain. One of the most common problems is the presence of noise in classification tasks, where noise refers to the incorrect labeling of training instances. This problem is very disruptive, as it changes the decision boundaries of the problem. Big Data problems pose a new challenge in terms of quality data d…

Cited by 15 publications (3 citation statements)
References 26 publications

“…Consequently, they proposed an ensemble filter based on C4.5, 1-NN, and linear machine. Similarly, some researchers [11], [12] proposed iterative partition filters that iteratively removed detected noise. In these methods, N filters were learned based on N group of N − 1 partitions and voted to detect noise on the entire dataset.…”
Section: Related Work a Data Cleaning Methods (mentioning)
confidence: 99%
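
The partition-based voting described in the statement above can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: the 1-NN base learner, the majority threshold, the rule that a point never serves as its own neighbour, and the names `partition_filter` and `iterative_partition_filter` are all assumptions made for the example.

```python
import random


def nearest_label(train_idx, X, y, query):
    """1-NN prediction for X[query] using only the rows in train_idx.

    Assumption: the query row is skipped, so a point cannot vote for itself.
    """
    best, best_dist = None, float("inf")
    for i in train_idx:
        if i == query:
            continue
        dist = sum((a - b) ** 2 for a, b in zip(X[i], X[query]))
        if dist < best_dist:
            best, best_dist = i, dist
    return y[best]


def partition_filter(X, y, n_partitions=3, seed=0):
    """Return positions flagged as noisy by a majority of the N filters.

    Filter k is "trained" on every partition except partition k, and every
    filter votes on every instance of the dataset.
    """
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    parts = [idx[p::n_partitions] for p in range(n_partitions)]
    train_sets = [
        [i for p, part in enumerate(parts) if p != held_out for i in part]
        for held_out in range(n_partitions)
    ]
    noisy = []
    for j in range(len(X)):
        wrong = sum(nearest_label(t, X, y, j) != y[j] for t in train_sets)
        if wrong > n_partitions / 2:  # assumed majority-vote threshold
            noisy.append(j)
    return noisy


def iterative_partition_filter(X, y, n_partitions=3, max_rounds=10, seed=0):
    """Repeat the filter, dropping flagged rows, until nothing is flagged."""
    keep = list(range(len(X)))
    for _ in range(max_rounds):
        flagged = set(
            partition_filter([X[i] for i in keep], [y[i] for i in keep],
                             n_partitions, seed)
        )
        if not flagged:
            break
        keep = [i for pos, i in enumerate(keep) if pos not in flagged]
    return keep  # original indices of the rows considered clean


if __name__ == "__main__":
    # Toy data: row 4 sits next to the class-0 cluster but is labelled 1.
    X = [(0.0,), (0.1,), (0.2,), (0.3,), (1.0,),
         (10.0,), (10.1,), (10.2,), (10.3,), (10.4,)]
    y = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
    print(iterative_partition_filter(X, y))
```

Swapping the 1-NN stand-in for another classifier only requires replacing `nearest_label`; the partitioning and voting logic stay the same.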
“…values [11] and reducing redundant [12] and noisy data [13] to obtain quality data from big datasets. In addition, there are contributions as proposed by Liu et al [14], where the results are improved and the runtime reduced in classification problems by selecting the appropriate classification rule according to a given neighborhood, instead of using the complete dataset.…”
Section: As a Key Technique Capable Of Imputing Missing (mentioning)
confidence: 99%
“…With the rapid development of Internet of Things (IoTs) technologies, 2–5 the current explosive growth of data 6 makes the information overload more and more serious, which can be solved by recommender systems 7–10 . Generally, the recommender systems can be classified into three categories: collaborative filtering methods, content‐based methods, and hybrid methods 11 .…”
Section: Introduction (mentioning)
confidence: 99%