2020
DOI: 10.3390/math8020286

A New K-Nearest Neighbors Classifier for Big Data Based on Efficient Data Pruning

Abstract: The K-nearest neighbors (KNN) machine learning algorithm is a well-known non-parametric classification method. However, like other traditional data mining methods, applying it to big data comes with computational challenges. KNN determines the class of a new sample from the classes of its nearest neighbors, but identifying those neighbors in a large amount of data imposes a computational cost so high that a single computing machine can no longer handle it. One of the proposed techniques to …
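Below is a minimal, illustrative sketch of the standard KNN classification step the abstract refers to, not the pruning-based variant the paper proposes; the data, function name, and parameters are placeholders chosen for illustration. It also shows why an exhaustive scan over the training set becomes costly on big data.

    # Plain K-nearest neighbors: find the K closest training samples to a
    # query point and take a majority vote over their labels. The exhaustive
    # distance scan below is the step that becomes prohibitive on big data.
    from collections import Counter
    import math

    def knn_predict(train_X, train_y, query, k=3):
        # Distance from the query to every training sample (full scan).
        dists = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
        # Majority vote among the K nearest labels.
        return Counter(y for _, y in dists[:k]).most_common(1)[0][0]

    train_X = [(1.0, 1.0), (1.2, 0.8), (3.0, 3.1), (3.2, 2.9), (0.9, 1.1)]
    train_y = ["A", "A", "B", "B", "A"]
    print(knn_predict(train_X, train_y, query=(2.8, 3.0), k=3))  # -> "B"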

Cited by 84 publications (40 citation statements)
References 23 publications
“…However, if K = 5, two points in the neighborhood are in Class A, and three are in Class B, so the new data point will be classified as Class B. It follows that the choice of the value of K has a big impact on the accuracy of the trained model [58]. There is no specific way to determine the best K value, so it is necessary to try different values to find the best one.…”
Section: K-Nearest Neighbors (mentioning)
confidence: 99%
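The quoted passage notes that K must be chosen empirically. A hedged sketch of that procedure, assuming scikit-learn and its bundled Iris toy dataset (neither is part of the cited paper), is:

    # Try several candidate K values and keep the one with the best
    # cross-validated accuracy; this mirrors the "try different values"
    # advice in the quoted statement.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import cross_val_score
    from sklearn.neighbors import KNeighborsClassifier

    X, y = load_iris(return_X_y=True)
    best_k, best_score = None, 0.0
    for k in (1, 3, 5, 7, 9, 11):
        score = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=5).mean()
        if score > best_score:
            best_k, best_score = k, score
    print(f"best K = {best_k} (cross-validated accuracy {best_score:.3f})")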
“…There are many methods of analysis in the field of using the electronic nose in beekeeping, including linear discriminant analysis (LDA), principal component analysis (PCA), and cluster analysis (CA) with the furthest neighbor method (kNN). Good results have also been obtained using artificial neural network (ANN) machine learning techniques, which use a multilayer perceptron model trained with a backpropagation algorithm [8][9][10][11][12][13][14][15].…”
Section: Achievements To Date In the Use Of Gas Sensors For This Type (mentioning)
confidence: 99%
“…It is based on Bayes' theorem [24] and is used when the number of inputs is very large. It is mostly used in mathematical and statistical fields.…”
Section: A Naïve Bayes Algorithm (mentioning)
confidence: 99%
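For context on the quoted description, the sketch below shows a naive Bayes classifier in the sense the passage describes: Bayes' theorem combined with a conditional-independence assumption over the input features, which keeps the method tractable when the number of inputs is large. scikit-learn and the toy data are assumptions made for illustration, not part of the cited paper.

    # Gaussian naive Bayes: P(class | features) is estimated via Bayes'
    # theorem, treating each feature as conditionally independent given
    # the class.
    from sklearn.naive_bayes import GaussianNB

    X = [[1.0, 2.1], [0.9, 1.8], [3.2, 3.9], [3.0, 4.1]]
    y = ["A", "A", "B", "B"]
    model = GaussianNB().fit(X, y)
    print(model.predict([[3.1, 4.0]]))  # -> ['B']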