improvement of the proposed method, several experiments were carried out using different real datasets. The presented results, which are achieved after extensive experiments, prove that the proposed algorithm improves the classification accuracy of KMeans. The achieved performance was also compared against several recent classification studies which are based on different classification schemes.
The K-Nearest Neighbor classifier is a well-known and widely applied method in data mining applications. Nevertheless, its high computation and memory usage cost makes the classical K-NN not feasible for today's Big Data analysis applications. To overcome the cost drawbacks of the known data mining methods, several distributed environment alternatives have emerged. Among these alternatives, Hadoop MapReduce distributed ecosystem attracted significant attention. Recently, several K-NN based classification algorithms have been proposed which are distributed methods tested in Hadoop environment and suitable for emerging data analysis needs. In this work, a new distributed Z-KNN algorithm is proposed, which improves the classification accuracy performance of the well-known K-Nearest Neighbor (K-NN) algorithm by benefiting from the representativeness relationship of the instances belonging to different data classes. The proposed algorithm relies on the data class representations derived from the Z data instances from each class, which are the closest to the test instance. The Z-KNN algorithm was tested in a physical Hadoop Cluster using several real-datasets belonging to different application areas. The performance results acquired after extensive experiments are presented in this paper and they prove that the proposed Z-KNN algorithm is a competitive alternative to other studies recently proposed in the literature
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.