In this article, a Hadoop-based big data approach to chronic kidney disease (CKD) prediction and classification is proposed, using improved fractional rough fuzzy K-means (IF-RFKM) clustering and an XGBoost classifier tuned by a rat swarm optimizer. The IF-RFKM clustering method is used for disease prediction. The XGBoost classifier then classifies the stages of chronic kidney disease as normal or abnormal, and the rat swarm optimization (RSO) algorithm is used to optimize the parameters of the XGBoost classifier. Initially, the data are randomly generated from the CKD dataset. The simulation is carried out in Python. In the simulation, the proposed method attains higher accuracy (99.57%, 98.28%, and 97.35%), higher recall (98.23%, 88.34%, and 78.96%), and lower execution time (92.15%, 90.25%, and 92.48%) compared with existing methods such as chronic kidney disease detection and classification by recursive feature elimination using decision tree (CCKD-RFE-DT), efficient chronic kidney disease classification and clustering using logistic regression (CCKD-LR), and efficient chronic kidney disease classification utilizing multi-kernel support vector machine (CCKD-MKC-SVM).

KEYWORDS
big data, chronic kidney diseases, improved fractional rough K-means clustering, rat swarm optimization, XGBoost classifier

INTRODUCTION

Technology has changed substantially in recent years, and these advances have produced huge amounts of data. [1] If these data are not used, they simply accumulate as unusable data. Such data can nevertheless become a very valuable resource if they are appropriately managed using a method named data mining. [2] Data mining is used to discover interesting patterns in information hidden in a database. [3] Using machine learning techniques and statistical calculations, pattern discovery is used to construct models that predict the behavior of data. [4] Classification is a technique for discovering patterns in data. Data mining techniques can analyze a huge quantity of data rapidly. [5] Actually, in the fields of statistics, business, forecasting, and communication engineering, data mining is not an advanced technique; it uses patterns found in data for the purposes of prediction, verification, and identification. [6] Currently, many clinical databases have been developed based on the progress of health database management systems. With data mining techniques, complex databases are maintained easily. [7] Data mining