“…Many scholars have proposed methods to improve this algorithm, and these methods fall into two categories: those based on the kNN algorithm alone, and those that integrate kNN with other algorithms. In the first category, Jing et al. presented a density-based method that clusters the sample data of each class into several clusters, removes noisy samples, and then merges highly similar documents within each cluster into a single document to reduce the training data [2]; Zhao et al. proposed an essential-vector-based kNN algorithm that discards most of the samples while preserving the original accuracy [3]; Bhattacharya et al. put forward a new affinity-based local distance (similarity) function for kNN text classification [4]; and Sarma et al. replaced linear interpolation with a Gaussian distribution for weighting the nearest neighbors [5]. In the second category, Ishii et al. proposed a method for grouping similar words and combined it with latent semantic analysis and the kNN algorithm to obtain higher classification accuracy [6]; Zhou et al. employed a hybrid SVM-kNN model that combines the advantages of the SVM and kNN algorithms to improve the prediction accuracy of SVM [7]; and Jiang et al. improved the kNN algorithm on the basis of Bayesian-kNN and Citation-kNN [8].…”
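
For orientation, the sketch below shows the baseline kNN text classifier that these improvements build on: term-frequency vectors, cosine similarity, and a majority vote over the k most similar training documents. It is an illustrative assumption, not code from any of the cited papers; the toy corpus, function names (`tf_vector`, `cosine_similarity`, `knn_classify`), and parameter k are hypothetical.

```python
# Minimal sketch of baseline kNN text classification (illustrative only;
# not the implementation from any of the cited works). Documents are
# term-frequency vectors compared with cosine similarity; the predicted
# label is the majority vote among the k most similar training documents.
import math
from collections import Counter

def tf_vector(text):
    """Term-frequency vector for a whitespace-tokenized document."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_classify(query, train_docs, train_labels, k=3):
    """Label the query by majority vote over its k nearest training documents."""
    q = tf_vector(query)
    sims = [(cosine_similarity(q, tf_vector(d)), lbl)
            for d, lbl in zip(train_docs, train_labels)]
    sims.sort(key=lambda pair: pair[0], reverse=True)
    votes = Counter(lbl for _, lbl in sims[:k])
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    # Toy corpus (hypothetical data for illustration).
    docs = ["stock market prices rise", "team wins the football match",
            "shares fall on the market", "coach praises the football team"]
    labels = ["finance", "sports", "finance", "sports"]
    print(knn_classify("market shares rise again", docs, labels, k=3))
```

The surveyed improvements can be read against this baseline: the first group shrinks or re-weights the training set and similarity function (e.g., cluster-based sample reduction, essential vectors, affinity-based distance, Gaussian neighbor weighting), while the second group replaces or augments the decision step by combining kNN with LSA, SVM, or Bayesian models.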