“…Many scholars have proposed methods to improve this algorithm, and these methods fall into two categories: those based on the kNN algorithm alone, and those that integrate kNN with other algorithms. In the first category, Jing et al. presented a density-based method that clusters the sample data of each class into several clusters, removes noisy samples, and then merges highly similar documents within each cluster into a single document to reduce the training data [2]; Zhao et al. proposed an essential-vector-based kNN algorithm that discards most of the samples while preserving the original accuracy [3]; Bhattacharya et al. put forward a new affinity-based local distance (similarity) function for kNN text classification [4]; and Sarma et al. replaced linear interpolation with a Gaussian distribution for weighting the nearest neighbors [5]. In the second category, Ishii et al. proposed a method for grouping similar words and combined it with latent semantic analysis and the kNN algorithm to obtain higher classification accuracy [6]; Zhou et al. employed a hybrid SVM-kNN model that combines the advantages of the SVM and kNN algorithms to improve the prediction accuracy of SVM [7]; and Jiang et al. improved the kNN algorithm on the basis of Bayesian-kNN and Citation-kNN [8].…”
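
For orientation, the sketch below shows the baseline kNN text classifier that these improvements build on: term-frequency vectors, cosine similarity, and a majority vote over the k most similar training documents. It is an illustrative assumption, not code from any of the cited papers; the toy corpus, function names (`tf_vector`, `cosine_similarity`, `knn_classify`), and parameter k are hypothetical.

```python
# Minimal sketch of baseline kNN text classification (illustrative only;
# not the implementation from any of the cited works). Documents are
# term-frequency vectors compared with cosine similarity; the predicted
# label is the majority vote among the k most similar training documents.
import math
from collections import Counter

def tf_vector(text):
    """Term-frequency vector for a whitespace-tokenized document."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_classify(query, train_docs, train_labels, k=3):
    """Label the query by majority vote over its k nearest training documents."""
    q = tf_vector(query)
    sims = [(cosine_similarity(q, tf_vector(d)), lbl)
            for d, lbl in zip(train_docs, train_labels)]
    sims.sort(key=lambda pair: pair[0], reverse=True)
    votes = Counter(lbl for _, lbl in sims[:k])
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    # Toy corpus (hypothetical data for illustration).
    docs = ["stock market prices rise", "team wins the football match",
            "shares fall on the market", "coach praises the football team"]
    labels = ["finance", "sports", "finance", "sports"]
    print(knn_classify("market shares rise again", docs, labels, k=3))
```

The surveyed improvements can be read against this baseline: the first group shrinks or re-weights the training set and similarity function (e.g., cluster-based sample reduction, essential vectors, affinity-based distance, Gaussian neighbor weighting), while the second group replaces or augments the decision step by combining kNN with LSA, SVM, or Bayesian models.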