Coronavirus disease 2019 (COVID-19) has wreaked havoc globally, resulting in millions of cases and deaths. The objective of this study was to predict mortality in hospitalized COVID-19 patients in Zambia using machine learning (ML) methods based on factors that have been shown to be predictive of mortality, and thereby improve pandemic preparedness. This research employed seven ML models: decision tree (DT), random forest (RF), support vector machines (SVM), logistic regression (LR), Naïve Bayes (NB), gradient boosting (GB), and XGBoost (XGB). These classifiers were trained on data from 1,433 hospitalized COVID-19 patients from various health facilities in Zambia. Model performance was assessed using accuracy, recall, F1-Score, area under the receiver operating characteristic curve (ROC_AUC), area under the precision-recall curve (PRC_AUC), and other metrics. The best-performing model was XGB, with an accuracy of 92.3%, recall of 94.2%, F1-Score of 92.4%, and ROC_AUC of 97.5%. A pairwise Mann–Whitney U-test analysis showed that the second-best model (GB) and the third-best model (RF) did not perform significantly worse than the best model (XGB). GB had an accuracy of 91.7%, recall of 94.2%, F1-Score of 91.9%, and ROC_AUC of 97.1%. RF had an accuracy of 90.8%, recall of 93.6%, F1-Score of 91.0%, and ROC_AUC of 96.8%. Other models showed similar results for the same metrics. The study successfully derived and validated the selected ML models, which predicted mortality effectively with reasonably high performance on the stated metrics. The feature importance analysis indicated that knowledge of patients’ underlying health conditions, hospital length of stay (LOS), white blood cell count, age, and other factors can help healthcare providers offer lifesaving services on time, improve pandemic preparedness, and decongest health facilities in Zambia and other countries with similar settings.
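The pairwise comparison mentioned above can be illustrated with a small sketch. The abstract does not say which per-model scores were fed to the Mann–Whitney U-test, so the example below assumes (hypothetically) a list of cross-validation fold accuracies per model. The function name `mann_whitney_u` and the fold scores are illustrative only, and the p-value uses a tie-corrected normal approximation, which is rough for so few folds; in practice a library routine such as `scipy.stats.mannwhitneyu` would likely be preferable.

```python
import math


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test (normal approximation, tie-corrected).

    Returns (U statistic for x, approximate two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    # Pool both samples, remembering which group each value came from.
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    tie_term = 0.0
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        # Tied values share the average of ranks i+1 .. j.
        avg_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0
    # Continuity-corrected z score, clipped at zero.
    z = max((abs(u_x - mean_u) - 0.5) / math.sqrt(var_u), 0.0)
    p_value = math.erfc(z / math.sqrt(2.0))
    return u_x, p_value


# Hypothetical fold accuracies for two models (not taken from the study).
model_a_folds = [0.92, 0.93, 0.91, 0.94, 0.92]
model_b_folds = [0.80, 0.79, 0.82, 0.81, 0.78]
u_stat, p_val = mann_whitney_u(model_a_folds, model_b_folds)
print(f"U = {u_stat}, p = {p_val:.4f}")
```

A large p-value in such a comparison would be read as "no significant difference between the two models", which is how the GB and RF results are described relative to XGB.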
COVID-19 has wreaked havoc globally, resulting in millions of cases and deaths. Scientists and public health professionals have used every form of advancing technology to curb the spread, predict unforeseen adverse events, improve preparedness, and bring the world under control once more. The objective of this study was to predict mortality in hospitalized COVID-19 patients in Zambia using ML methods from factors that have been shown to be predictive of mortality. This research used ML models to predict COVID-19 mortality in 1,433 hospitalized patients in Zambia. A feature importance analysis helped identify important factors. The GB, RF, SVM, DT, LR, and NB models were used, and various performance metrics were checked. The feature importance analysis found that hospital length of stay (LOS) and white blood cell count were the most influential features; the other factors, in decreasing order of importance, were age, wave, diabetes, hypertension, and sex. The top three models achieved the following: GB had an accuracy of 91.5%, recall of 93.6%, F1 Score of 91.7%, and ROC-AUC of 96.9%. RF had an accuracy of 90.9%, recall of 93.8%, F1 Score of 91.2%, and ROC-AUC of 96.8%. SVM had an accuracy of 87.8%, recall of 91.2%, F1 Score of 88.2%, and ROC-AUC of 94.1%. Other models showed similar results for the same metrics. The study successfully derived and validated multiple ML models that predicted mortality effectively with reasonably high performance on the stated metrics. GB was best suited to the data in this study and was thus recommended for similar studies, with RF as the best alternative. Knowledge of patients’ underlying health conditions, LOS, white blood cell count, age, and other factors can help healthcare providers offer lifesaving services on time, improve preparedness, and decongest health facilities.
Background: The coronavirus has caused havoc all over the world, leaving no country untouched and resulting in millions of cases and deaths. In an effort to fight back, scientists and public health professionals have used every form of advancing technology to curb the spread, predict unforeseen adverse events, improve preparedness, and bring the world under control once more.
Objective: The objective of this study was to predict mortality in hospitalized COVID-19 patients in Zambia using ML methods from a number of predictors that have been shown to be predictive of mortality.
Methods: This research used ML models to predict COVID-19 mortality in 1,433 hospitalized patients in Zambia. A feature importance analysis helped identify important factors. The GB, RF, SVM, DT, LR, and NB models were used. The performance metrics checked for each model were accuracy, recall, specificity, precision, F1 Score, ROC-AUC, and PRC-AUC.
Results: The feature importance analysis found that hospital length of stay (LOS) and white blood cell count were the most influential features; the other factors, in decreasing order of importance, were age, wave, diabetes, hypertension, and sex. GB achieved an accuracy of 91.5%, recall of 93.6%, F1 Score of 91.7%, and ROC-AUC of 96.9%. RF achieved an accuracy of 90.9%, recall of 93.8%, F1 Score of 91.2%, and ROC-AUC of 96.8%. SVM achieved an accuracy of 87.8%, recall of 91.2%, F1 Score of 88.2%, and ROC-AUC of 94.1%. The accuracy and ROC-AUC of the other models were, respectively, 88.2% and 90.7% for DT, 81.9% and 90.1% for LR, and 79.2% and 86.9% for NB.
Conclusion: The study successfully derived and validated multiple ML models that predicted mortality effectively with reasonably high performance on the stated metrics. GB was best suited to the data in our study and was thus recommended for similar studies, with RF as the best alternative. Knowledge of patients’ underlying characteristics (hospital length of stay (LOS), white blood cell count, age, sex, hypertension, diabetes, and other factors) can help healthcare providers offer lifesaving services on time, improve preparedness, and decongest health facilities.
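As a reminder of how the reported metrics relate to each other, the following is a minimal sketch in plain Python. It is not the study's code: the helper names (`classification_metrics`, `roc_auc`), the 0.5 decision threshold, and the eight toy labels and scores are all assumptions made for illustration. Accuracy, recall, specificity, precision, and F1 come from the confusion-matrix counts, while ROC-AUC is computed as the fraction of (death, survivor) pairs in which the death receives the higher predicted score.

```python
def classification_metrics(y_true, y_pred):
    """Compute threshold-based metrics from binary labels (1 = died)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


def roc_auc(y_true, scores):
    """ROC-AUC as the probability that a positive outranks a negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    # Ties between a positive and a negative count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, purely illustrative (not from the study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

The pairwise ROC-AUC loop is quadratic in the number of patients, which is fine for a dataset of this size; a library implementation would normally be used for anything larger.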