Background
The rapid prevalence of coronavirus disease 2019 (COVID‐19) has caused a pandemic worldwide and affected the lives of millions. The potential fatality of the disease has led to global public health concerns. Apart from clinical practice, artificial intelligence (AI) has provided a new model for the early diagnosis and prediction of disease based on machine learning (ML) algorithms. In this study, we aimed to make a prediction model for the prognosis of COVID‐19 patients using data mining techniques.
Methods
In this study, a data set was obtained from the intelligent management system repository of 19 hospitals at Shahid Beheshti University of Medical Sciences in Iran. All patients admitted had shown positive polymerase chain reaction (PCR) test results. They were hospitalized between February 19 and May 12 in 2020, which were investigated in this study. The extracted data set has 8621 data instances. The data include demographic information and results of 16 laboratory tests. In the first stage, preprocessing was performed on the data. Then, among 15 laboratory tests, four of them were selected. The models were created based on seven data mining algorithms, and finally, the performances of the models were compared with each other.
Results
Based on our results, the Random Forest (RF) and Gradient Boosted Trees models were known as the most efficient methods, with the highest accuracy percentage of 86.45% and 84.80%, respectively. In contrast, the Decision Tree exhibited the least accuracy (75.43%) among the seven models.
Conclusion
Data mining methods have the potential to be used for predicting outcomes of COVID‐19 patients with the use of lab tests and demographic features. After validating these methods, they could be implemented in clinical decision support systems for better management and providing care to severe COVID‐19 patients.