Classification algorithms are central to data mining, machine learning, pattern recognition, and other data analysis applications. This work applies the weighted nearest neighbors and fuzzy k-nearest neighbors algorithms to classify selected medical datasets, using several distance functions to measure the dissimilarity between any two instances. Classification approaches based on k-nearest neighbors (KNN), weighted-KNN, frequency, class probability, and fuzzy k-nearest neighbors (fuzzy-KNN) are analyzed and discussed. Measurable criteria are adopted to evaluate the performance of these algorithms: classification accuracy, classification time, and confidence values. The algorithms are tested on four medical datasets. The results show that fuzzy-KNN achieved the best accuracy among the adopted algorithms, followed by weighted-KNN and then KNN. Fuzzy-KNN required the longest classification time, while KNN required the shortest. The class confidence values of the fuzzy approach were promising. Fuzzy-KNN was also modified using fuzzy entropy. Relative to KNN on the chosen datasets, the algorithms improved classification accuracy by up to 25%, 33%, and 38% for weighted-KNN, fuzzy-KNN, and the fuzzy-entropy variant, respectively.
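A minimal sketch of the two main voting schemes compared above, assuming Euclidean distance and the standard Keller-style fuzzy membership rule; the abstract does not specify the paper's exact weighting scheme, neighbor count k, or fuzzifier m, so these are illustrative choices only.

```python
# Hypothetical sketch: inverse-distance-weighted KNN voting and fuzzy-KNN
# class memberships. Assumes Euclidean distance and a Keller-style
# membership formula; parameters k and m are placeholder values.
import numpy as np

def weighted_knn_predict(X_train, y_train, x, k=5, eps=1e-9):
    """Classify x by inverse-distance-weighted votes of its k nearest neighbors."""
    d = np.linalg.norm(X_train - x, axis=1)        # distances to all training points
    idx = np.argsort(d)[:k]                        # indices of the k nearest neighbors
    votes = {}
    for i in idx:
        votes[y_train[i]] = votes.get(y_train[i], 0.0) + 1.0 / (d[i] + eps)
    return max(votes, key=votes.get)

def fuzzy_knn_memberships(X_train, y_train, x, k=5, m=2.0, eps=1e-9):
    """Return normalized class memberships for x; the largest gives the prediction,
    and the memberships themselves serve as per-class confidence values."""
    d = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(d)[:k]
    w = 1.0 / (d[idx] ** (2.0 / (m - 1.0)) + eps)  # distance-based fuzzy weights
    classes = np.unique(y_train)
    u = np.array([w[y_train[idx] == c].sum() for c in classes])
    return dict(zip(classes, u / u.sum()))
```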
The main objective of clustering is to partition a set of objects into groups, or clusters, such that objects within a cluster are more similar to one another than to objects in other clusters. This work analyzes, discusses, and compares three clustering algorithms based on partitioning, hierarchical, and swarm-intelligence approaches: k-means clustering, hierarchical agglomerative clustering, and ant clustering, respectively. The algorithms are tested on three different datasets, and measurable criteria are used to evaluate their performance: intra-cluster distance, inter-cluster distance, and clustering time. The experimental results show that the k-means algorithm is faster and easier to understand than the other two algorithms, but it cannot determine the appropriate number of clusters and relies on the user to specify this in advance. An advantage of the hierarchical clustering algorithm is its ease of handling any form of similarity or distance; its drawback concerns the embedded flexibility regarding the level of granularity. The ant-clustering algorithm can detect more similar data for larger values of the swarm coefficients. Ant clustering outperforms the other two algorithms only when the swarm parameters are well chosen; otherwise, agglomerative hierarchical clustering performs best.
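A minimal sketch of the two distance-based evaluation criteria named above, assuming Euclidean distance, centroid-based definitions of compactness and separation, and a labels array produced by any of the three clustering algorithms; the abstract does not give the paper's exact formulas, so these definitions are illustrative.

```python
# Hypothetical sketch of the evaluation criteria: total intra-cluster distance
# (compactness, lower is better) and minimum inter-cluster centroid distance
# (separation, higher is better). Euclidean distance is an assumption.
import numpy as np

def intra_cluster_distance(X, labels):
    """Sum of distances from each point to its own cluster centroid."""
    total = 0.0
    for c in np.unique(labels):
        pts = X[labels == c]
        centroid = pts.mean(axis=0)
        total += np.linalg.norm(pts - centroid, axis=1).sum()
    return total

def inter_cluster_distance(X, labels):
    """Smallest pairwise distance between cluster centroids."""
    centroids = np.array([X[labels == c].mean(axis=0) for c in np.unique(labels)])
    dists = [np.linalg.norm(a - b)
             for i, a in enumerate(centroids)
             for b in centroids[i + 1:]]
    return min(dists)
```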