2022
DOI: 10.3390/s22197268

An Ensemble Approach for the Prediction of Diabetes Mellitus Using a Soft Voting Classifier with an Explainable AI

Abstract: Diabetes is a chronic disease that continues to be a primary and worldwide health concern since the health of the entire population has been affected by it. Over the years, many academics have attempted to develop a reliable diabetes prediction model using machine learning (ML) algorithms. However, these research investigations have had a minimal impact on clinical practice as the current studies focus mainly on improving the performance of complicated ML models while ignoring their explainability to clinical …

Cited by 47 publications (13 citation statements)
References 39 publications
“…Remarkably, Glucose, BMI, and Age were recognized as the most salient features [68]. Similarly, another study employed similar methods including RF and XGBoost, and employed LIM and SHAP as explainers [69].…”
Section: Analysis of the XAI Evaluation
Citation type: mentioning
confidence: 95%
“…For model development, the study cohort was randomly divided to create a 70%:30% training set to test set ratio. Because the number of ESRD cases was much smaller than the number of non-ESRD cases, we performed the synthetic minority over-sampling technique (SMOTE)-Tomek algorithms to balance the number of samples taken for imbalanced data [18,19]. Six machine learning models, including logistic regression, extra trees [20], random forest [21], gradient boosting decision tree (GBDT) [22], extreme gradient boosting models (XGBoost) [23], and light gradient boosting machine (LGBM) [24], are performed.…”
Section: Data Cleaning and Machine Learning Model Development
Citation type: mentioning
confidence: 99%
“…As a result, Choudary focuses on Precision and Recall. Hafsa Binte Kibria used a ratio of training data : test data is 7:3. In her research, it was found that when using the Random Forest method, the irrelevant feature was glucose, blood pressure, and pregnancy [21]. Vaishali, tried to find important features by using feature selection based on genetic algorithms.…”
Section: Research on Important Features in the Pima Indian Database
Citation type: mentioning
confidence: 99%