2022
DOI: 10.3390/app12073673

Exploring Early Prediction of Chronic Kidney Disease Using Machine Learning Algorithms for Small and Imbalanced Datasets

Abstract: Chronic kidney disease (CKD) is a worldwide public health problem, usually diagnosed in the late stages of the disease. To alleviate this issue, investment in early prediction is necessary. The purpose of this study is to assist the early prediction of CKD, addressing problems related to imbalanced and limited-size datasets. We used data from medical records of Brazilians with or without a diagnosis of CKD, containing the following attributes: hypertension, diabetes mellitus, creatinine, urea, albuminuria, age…

Cited by 26 publications (11 citation statements) · References 44 publications
“…Hence, the proposed approach is compared with the following methods: a probabilistic neural network (PNN) [ 66 ], an enhanced sparse autoencoder (SAE) neural network [ 10 ], a naïve Bayes (NB) classifier with feature selection [ 67 ], a feature selection method based on cost-sensitive ensemble and random forest [ 3 ], a linear support vector machine (LSVM) and synthetic minority oversampling technique (SMOTE) [ 11 ], a cost-sensitive random forest [ 68 ], a feature selection method based on recursive feature elimination (RFE) and artificial neural network (ANN) [ 69 ], a correlation-based feature selection (CFS) and ANN [ 69 ]. The other methods include optimal subset regression (OSR) and random forest [ 9 ], an approach to identify the essential CKD features using improved linear discriminant analysis (LDA) [ 13 ], a deep belief network (DBN) with Softmax classifier [ 70 ], a random forest (RF) classifier with feature selection (FS) [ 71 ], a model based on decision tree and the SMOTE technique [ 12 ], a logistic regression (LR) classifier with recursive feature elimination (RFE) technique [ 14 ], and an XGBoost model with a feature selection approach combining the extra tree classifier (ETC), univariate selection (US), and RFE [ 15 ].…”
Section: Results
confidence: 99%
“…Furthermore, Silveira et al [ 12 ] developed a CKD prediction approach using a variety of resampling techniques and ML algorithms. The resampling techniques include the synthetic minority oversampling technique (SMOTE) and Borderline-SMOTE, while the classifiers include random forest, decision tree, and AdaBoost.…”
Section: Introduction
confidence: 99%
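The citation above refers to resampling with SMOTE, which synthesizes new minority-class samples by interpolating between a minority point and one of its nearest minority-class neighbours. The following is a minimal sketch of that core idea in plain NumPy — not the implementation used by Silveira et al. or the `imbalanced-learn` library; the toy data, function name, and parameter choices are illustrative assumptions.

```python
import numpy as np

def smote_oversample(X_min, n_new, k=3, rng=None):
    """Minimal SMOTE sketch: create n_new synthetic minority samples
    by interpolating each randomly chosen seed point toward one of
    its k nearest minority-class neighbours."""
    rng = np.random.default_rng(rng)
    n = len(X_min)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(n)
        # Euclidean distances from seed i to all minority samples
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]   # skip the seed itself
        j = rng.choice(neighbours)
        gap = rng.random()                    # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic)

# Imbalanced toy data: 20 majority vs. 5 minority samples (hypothetical)
rng = np.random.default_rng(0)
X_maj = rng.normal(0.0, 1.0, size=(20, 2))
X_min = rng.normal(3.0, 0.5, size=(5, 2))

# Synthesize 15 minority samples so both classes end up with 20
X_new = smote_oversample(X_min, n_new=15, k=3, rng=1)
X_bal = np.vstack([X_maj, X_min, X_new])
y_bal = np.array([0] * 20 + [1] * (5 + 15))
print(X_new.shape)          # shape of the synthetic samples
print(np.bincount(y_bal))   # class counts after oversampling
```

Because each synthetic point lies on the segment between two real minority samples, SMOTE stays inside the minority region rather than duplicating points, which is what distinguishes it from plain random oversampling; Borderline-SMOTE further restricts the seeds to samples near the class boundary.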
“…Considered an open problem in the context of deep learning with ECG signals [4], class imbalance is recognized by academia and industry as an obstacle to developing effective deep learning models with large numbers of parameters, since it makes the training phase harder. ML practitioners can mitigate this problem using data augmentation techniques such as the Synthetic Minority Oversampling Technique (SMOTE) [5]. Recently, a new approach called few-shot learning [6] has been popularized and stands out in image processing problems.…”
Section: Introduction
confidence: 99%
“…This study aims to construct an effective hybrid important-risk-factor evaluation scheme for CKD stage 3a and 3b patients with MetS, based on ML predictive models. Our study used six well-known and effective ML techniques—random forest (RF), logistic regression (LGR), multivariate adaptive regression splines (MARS), extreme gradient boosting (XGBoost), gradient boosting with categorical features support (CatBoost), and light gradient boosting machine (LightGBM)—to develop ML predictive models [16, 18, 19, 37, 38, 39]. The results of identifying important risk factors can provide valuable information regarding the prevention of CKD and health promotion.…”
Section: Introduction
confidence: 99%
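The risk-factor evaluation workflow described in the citation above — fit several classifiers, then rank features by importance — can be sketched with two of the six model families using scikit-learn. This is an illustrative assumption, not the cited study's code: the data here is synthetic, and the feature names are placeholders borrowed from the attribute list in the abstract; XGBoost, CatBoost, LightGBM, and MARS follow the same fit-then-rank pattern via their own packages.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for a clinical dataset (placeholder feature names)
X, y = make_classification(n_samples=300, n_features=6, n_informative=4,
                           random_state=42)
features = ["age", "creatinine", "urea", "albuminuria",
            "hypertension", "diabetes"]

# Two of the six model families named in the citation
rf = RandomForestClassifier(n_estimators=200, random_state=42).fit(X, y)
lgr = LogisticRegression(max_iter=1000).fit(X, y)

# Rank candidate risk factors by RF impurity-based importance
ranking = sorted(zip(features, rf.feature_importances_),
                 key=lambda t: -t[1])
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In practice the hybrid scheme would intersect or average the rankings produced by all six models, so that a feature is flagged as an important risk factor only when several model families agree.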