2010
DOI: 10.1007/s10462-010-9156-z

A study of the effect of different types of noise on the precision of supervised learning techniques

Abstract: Machine learning techniques often have to deal with noisy data, which may affect the accuracy of the resulting data models. Therefore, effectively dealing with noise is a key aspect in supervised learning to obtain reliable models from data. Although several authors have studied the effect of noise for some particular learners, comparisons of its effect among different learners are lacking. In this paper, we address this issue by systematically comparing how different degrees of noise affect four supervised le…
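The kind of experiment the abstract describes, injecting controlled amounts of noise and comparing how strongly each learner degrades, can be illustrated with a small sketch. This is a minimal sketch, assuming scikit-learn stand-ins for the four algorithms named in the citing literature (GaussianNB for naïve Bayes, DecisionTreeClassifier for C4.5, KNeighborsClassifier for IBk, SVC for SMO), a synthetic dataset, and simple random label-flipping as the noise model; the study's actual noise types, datasets, and protocol may differ.

```python
# Sketch: measure how increasing class-label noise in the training data
# degrades test accuracy for four common supervised learners. The learners,
# dataset, and noise model are illustrative assumptions, not the cited
# paper's exact setup.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=20, n_informative=10,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

def flip_labels(y, rate, rng):
    """Randomly flip a fraction `rate` of binary class labels (class noise)."""
    y_noisy = y.copy()
    idx = rng.choice(len(y), size=int(rate * len(y)), replace=False)
    y_noisy[idx] = 1 - y_noisy[idx]
    return y_noisy

learners = {
    "NaiveBayes": GaussianNB(),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "kNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(kernel="rbf"),
}

for rate in (0.0, 0.1, 0.2, 0.3, 0.4):
    y_noisy = flip_labels(y_tr, rate, rng)   # corrupt only the training labels
    scores = {name: accuracy_score(y_te, clf.fit(X_tr, y_noisy).predict(X_te))
              for name, clf in learners.items()}
    print(f"noise={rate:.1f}  " +
          "  ".join(f"{n}={s:.3f}" for n, s in scores.items()))
```

Only the training labels are corrupted so that the clean test set measures how well each learner resists the noise rather than how well it reproduces it.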

Cited by 369 publications (208 citation statements)
References 18 publications
“…Because there are numerous practical challenges, they cannot be simply treated as black-box (simply enter the input features). Some of the difficulties are: large dimensionality of feature vectors [50], bias/variance dilemma [51], input and output noise [52], large-scale training data [53], data heterogeneity [54], data redundancy [55] and non-linearity among features [56].…”
Section: Supervised Learning
confidence: 99%
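The "input and output noise" item in this excerpt distinguishes attribute (feature) noise from class (label) noise, the two noise sources varied in the cited study. Below is a minimal sketch of injecting attribute noise, assuming a zero-mean Gaussian corruption scaled to each feature's spread (an illustrative choice, not the cited paper's exact noise model); label-flipping, the output-noise counterpart, appears in the earlier sketch.

```python
# Minimal sketch of attribute (input) noise: perturb each feature of a NumPy
# array with zero-mean Gaussian noise scaled to that feature's standard
# deviation. The 10% scale is an illustrative assumption.
import numpy as np

def add_attribute_noise(X, scale=0.1, seed=1):
    """Return a copy of X with Gaussian input noise added to every feature."""
    rng = np.random.default_rng(seed)
    return X + rng.normal(0.0, scale * X.std(axis=0), size=X.shape)
```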
“…In most applications, each parameter (or variable) will represent a column of the data matrix and the timestamp will be an additional variable. Sometimes, some pre-processing is required to integrate the information of several sensors in a single indicator (like mean water level of a bioreactor for example, measured by several sensors in different points) and the imprecision or uncertainty associated to the measurements must be properly treated at pre-processing level, in particular the noise associated to the signals [168]. Regarding imprecision, georadar, or GPS systems, for example, provide a region where the target can be located with a certain probability, but are not able to provide exact positions of the target instances.…”
Section: Building the Original Data Matrix
confidence: 99%
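As a concrete illustration of the pre-processing step this excerpt describes (folding several redundant sensor readings into a single indicator and treating signal noise before modelling), here is a minimal sketch using pandas. The column names, the three-sensor mean, and the 5-sample rolling window are illustrative assumptions, not details from the cited work.

```python
# Sketch: merge redundant sensor columns into one indicator and apply a
# simple rolling-mean filter as a pre-processing step for signal noise.
import pandas as pd

df = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-01", periods=8, freq="min"),
    "level_s1": [1.02, 1.05, 0.98, 1.10, 1.04, 1.01, 0.97, 1.06],
    "level_s2": [1.00, 1.07, 0.95, 1.12, 1.03, 1.00, 0.99, 1.05],
    "level_s3": [1.01, 1.04, 0.99, 1.09, 1.05, 1.02, 0.98, 1.07],
})

sensor_cols = ["level_s1", "level_s2", "level_s3"]
# Single indicator: mean of the redundant sensors at each timestamp.
df["mean_level"] = df[sensor_cols].mean(axis=1)
# Noise treatment at pre-processing: centred rolling mean over 5 samples.
df["mean_level_smoothed"] = df["mean_level"].rolling(window=5, center=True,
                                                     min_periods=1).mean()
print(df[["timestamp", "mean_level", "mean_level_smoothed"]])
```

A centred rolling mean is only one simple smoothing choice; a median filter or domain-specific calibration could equally serve as the noise treatment at this stage.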
“…Domingos et al. [32] found that NB performance is competitive with more sophisticated ML methods, such as DT, IBL, and rule induction, even if the features' dependency is very strong. Moreover, NB is a strongly noise-tolerant algorithm [33], [34]. Nettleton et al. [33] performed a systematic analysis of robustness of many ML algorithms to noise, namely NB, C4.5, IBk, and SMO.…”
Section: B. Naïve Bayesian (NB)
confidence: 99%