2014 Sixth World Congress on Nature and Biologically Inspired Computing (NaBIC 2014) 2014
DOI: 10.1109/nabic.2014.6921892

Deleting or keeping outliers for classifier training?

Abstract: This paper introduces two statistical outlier detection approaches by classes. Experiments on binary and multi-class classification problems reveal that the partial removal of outliers significantly improves one or two performance measures for C4.5 and 1-nearest neighbour classifiers. Also, a taxonomy of problems according to the amount of outliers is proposed.

Cited by 19 publications (18 citation statements)
References 11 publications
“…The findings indicated that omitting outliers from the training data considerably refined the kNN classifier. In a general case study, Tallon-Ballesteros and Riquelme evaluated the outlier effect in classification problems [48]. The study proposed a statistical outlier detection method to determine the outliers based on inter-quartile range (IQR) by classes.…”
Section: Outlier Detection Methods (mentioning)
confidence: 99%
“…Previous studies showed that removing the outlier can improve the classification accuracy. Tallón-Ballesteros and Riquelme utilized outlier detection for a classification model [50]. The authors proposed a statistical outlier detection method based on the interquartile range (IQR) with classes.…”
Section: Literature Review (mentioning)
confidence: 99%
“…However, machine learning algorithms encounter problems with outlier data, which can reduce the accuracy of the classification model. Outlier detection can be applied to identify and remove outliers; thus, improving the performance of classification models [50, 51]. One of the techniques used for outlier detection is Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [52].…”
Section: Introduction (mentioning)
confidence: 99%
“…A statistical outlier detection method based on the partial removal of outliers according to the inter-quartile range of all the instances with the same class label was introduced in [22] with the name OUTLIERoutP. The framework can be reviewed in Figure 2 of the aforementioned work.…”
Section: Proposal (mentioning)
confidence: 99%
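The statements above describe detecting outliers from the inter-quartile range of the instances that share a class label. A minimal sketch of that general idea follows; it is not the OUTLIERoutP procedure itself, whose details (including how the "partial" removal selects among flagged instances) are not given on this page. The function name `iqr_outliers_by_class`, the conventional 1.5 fence multiplier `k`, and the choice to flag an instance when any single feature falls outside its fence are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import quantiles


def iqr_outliers_by_class(rows, labels, k=1.5):
    """Return sorted indices of rows outside the per-class IQR fences.

    Hypothetical helper: rows is a list of numeric feature lists,
    labels is a parallel list of class labels, and k is the assumed
    fence multiplier (1.5 is the usual box-plot convention).
    """
    # Group row indices by class label so quartiles are computed per class.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    flagged = set()
    for indices in by_class.values():
        if len(indices) < 2:
            continue  # quartiles need at least two instances
        n_features = len(rows[indices[0]])
        for j in range(n_features):
            values = [rows[i][j] for i in indices]
            q1, _, q3 = quantiles(values, n=4, method="inclusive")
            iqr = q3 - q1
            low, high = q1 - k * iqr, q3 + k * iqr
            # Flag instances whose j-th feature lies outside the fences.
            flagged.update(i for i in indices if not low <= rows[i][j] <= high)
    return sorted(flagged)


# The value 12.0 is unusual for class "a" but typical for class "b".
rows = [[1.0], [2.0], [3.0], [4.0], [12.0],
        [10.0], [11.0], [12.0], [13.0], [14.0]]
labels = ["a"] * 5 + ["b"] * 5
outlier_indices = iqr_outliers_by_class(rows, labels)
```

Computing the fences per class rather than over the whole data set means a value is judged only against instances of its own class, which is the point the citing papers emphasise. A caller wanting partial removal could then drop only some of the flagged indices before training.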
“…According to the literature, existing studies tackle data cleansing or feature selection in an isolated way. Table 1 describes the data sets utilised together with the outlier level according to the taxonomy proposed in [22]. Most of them are publicly available in the UCI (University of California at Irvine) repository [4].…”
Section: Proposal (mentioning)
confidence: 99%