2003
DOI: 10.1007/3-540-44862-4_83

Empirical Evaluation of the Difficulty of Finding a Good Value of k for the Nearest Neighbor

Abstract: As an analysis of the classification accuracy bound for the Nearest Neighbor technique, in this work we study whether it is possible to find a good value of the parameter k for each example according to its attribute values, or at least whether there is a pattern for the parameter k in the original search space. We carried out different approaches based on the Nearest Neighbor technique and calculated the prediction accuracy for a group of databases from the UCI repository. Based on the experimen…

Cited by 9 publications (6 citation statements) | References 9 publications
“…Tang [39] proposes a traffic prediction method for scaling resources in NFV environments based on traffic modeling with an Autoregressive Moving Average (ARMA) model; the predicted traffic values are obtained by minimizing the MSE. Among the solutions based on predicting the resources to be allocated, Farahnakian [40] proposes regression algorithms for estimating memory and processing consumption in cloud datacenters; the proposed solutions are based on Linear Regression [41] and K-Nearest Neighbor Regression (K-NNR) [42], methods that, as is well known, determine the prediction by minimizing symmetric error functions. A VNF migration algorithm is proposed and investigated in [43]; it is based on a deep belief network framework that predicts future resource requirements; the authors show that the proposed solution obtains better estimates of CPU resources, in terms of MSE, than a solution based on a Back Propagation Neural Network [44].…”
Section: Related Work and Research Motivation (mentioning)
confidence: 99%
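A minimal sketch of the K-NNR idea mentioned above, assuming plain Euclidean distance and a toy 1-D feature: the prediction is the mean target of the k nearest training points, and the mean is exactly the constant that minimizes squared error over the neighborhood, which is why K-NNR is described as minimizing a symmetric (MSE-style) loss. The function name and data are illustrative, not from [40].

```python
import numpy as np

def knn_regress(X_train, y_train, x_query, k=3):
    """Predict a resource value for x_query as the mean target of its
    k nearest training points. The mean is the constant minimizing the
    squared-error loss over the neighborhood."""
    dists = np.linalg.norm(X_train - x_query, axis=1)  # Euclidean distances
    nearest = np.argsort(dists)[:k]                    # indices of the k closest points
    return y_train[nearest].mean()

# Toy example: past CPU load (feature) -> next-interval load (target).
X = np.array([[0.2], [0.3], [0.5], [0.7], [0.9]])
y = np.array([0.25, 0.35, 0.55, 0.75, 0.95])
print(knn_regress(X, y, np.array([0.6]), k=3))  # ~0.55
```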
“…From an applied perspective, Ferrer-Troyano et al. [6] present a comparison of k-NN over various UCI datasets, showing that finding a 'best' k can be difficult. They observe that larger values of k produce smaller errors on some datasets, while low values of k perform comparably on others.…”
Section: Selecting k for k-NN (mentioning)
confidence: 99%
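The per-dataset comparison described in [6] can be imitated with a small grid search over k. The sketch below, using scikit-learn and the Iris data as stand-ins for the UCI databases in the study, estimates cross-validated accuracy for a few candidate values of k; the candidate grid and CV settings are our assumptions.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Estimate accuracy for several candidate values of k on one dataset
# and check whether any single value clearly dominates.
X, y = load_iris(return_X_y=True)
for k in (1, 3, 5, 11, 21):
    acc = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
    print(f"k={k:2d}  mean CV accuracy={acc:.3f}")
```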
“…However, larger values of k tend to produce smoother models and are less sensitive to label noise. Ferrer-Troyano et al. [6] show that for some datasets the prediction error varies greatly depending on the value selected for k. Thus, the choice of k must be made carefully for the task at hand.…”
Section: Introduction (mentioning)
confidence: 99%
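To make the smoothness/noise trade-off concrete, here is an illustrative experiment (our construction, not one from [6]): flip a fraction of the training labels and compare a small and a large k. With larger k, isolated flipped labels are usually outvoted, so test accuracy degrades less.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=600, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
flip = rng.random(len(y_tr)) < 0.15           # flip ~15% of training labels
y_noisy = np.where(flip, 1 - y_tr, y_tr)

for k in (1, 15):
    acc = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_noisy).score(X_te, y_te)
    print(f"k={k:2d}  test accuracy={acc:.3f}")   # larger k resists label noise
```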
“…In an earlier work [15], we showed that the performance of the kNN classifier can be increased significantly by improving the distance measure and the similarity function. However, a proper choice of k is also crucial for better performance of the kNN classifier [3,16,19]. In this work, we propose a novel test-point-specific k estimation strategy aimed solely at improving the classification accuracy of the kNN classifier.…”
Section: Introduction (mentioning)
confidence: 99%
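The cited paper's estimation strategy is not reproduced here. As a purely hypothetical stand-in, the sketch below grows k per query until the neighborhood vote reaches a confidence threshold, which conveys what a "test point-specific k" can mean in practice; all names and thresholds are our choices.

```python
import numpy as np
from collections import Counter

def predict_adaptive_k(X_tr, y_tr, x, k_min=3, k_max=15, conf=0.8):
    """Toy per-query k selection (our heuristic, NOT the strategy of the
    cited paper): grow k until the majority vote among the k nearest
    training labels reaches the confidence threshold conf."""
    order = np.argsort(np.linalg.norm(X_tr - x, axis=1))  # neighbors by distance
    for k in range(k_min, k_max + 1, 2):                  # odd k avoids ties
        label, votes = Counter(y_tr[order[:k]]).most_common(1)[0]
        if votes / k >= conf:
            return label           # confident enough at this neighborhood size
    return label                   # fall back to the largest k considered
```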
“…The performance of the kNN algorithm depends on several key factors, including i) a suitable distance measure, ii) a similarity measure for voting, and iii) an appropriate choice of the parameter k [14][15][16][17][18]. In an earlier work [15], we showed that the performance of the kNN classifier can be increased significantly by improving the distance measure and the similarity function.…”
Section: Introduction (mentioning)
confidence: 99%
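The three knobs listed in i)-iii) can be seen in one place in the following sketch, which takes a pluggable distance function, weights votes by inverse distance as a simple similarity measure, and fixes k as a parameter. All names are illustrative, not from the cited works.

```python
import numpy as np

def weighted_knn_predict(X_tr, y_tr, x, k=5,
                         dist=lambda a, b: np.linalg.norm(a - b)):
    """i) `dist` is a pluggable distance measure,
    ii) votes are weighted by inverse distance (the similarity measure),
    iii) `k` fixes the neighborhood size."""
    d = np.array([dist(row, x) for row in X_tr])   # distance to every training point
    idx = np.argsort(d)[:k]                        # k nearest neighbors
    votes = {}
    for i in idx:
        votes[y_tr[i]] = votes.get(y_tr[i], 0.0) + 1.0 / (d[i] + 1e-12)
    return max(votes, key=votes.get)               # label with the heaviest vote
```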