ABSTRACT
Objective: The Covid-19 outbreak has become the primary health problem of many countries because of its health-related, social, economic and individual effects. In addition to the development of outbreak prediction models, the examination of risk factors for the disease and the development of diagnostic models are of high importance. This study introduces the Covid19PredictoR interface, a workflow in which machine learning approaches are used to diagnose Covid-19 based on clinical data such as routine laboratory test results, risk factors and information on co-existing health conditions.
Materials and Methods: The Covid19PredictoR interface is an open-source, web-based interface built on R/Shiny (https://biodatalab.shinyapps.io/Covid19PredictoR/). Logistic regression, C5.0, decision tree, random forest and XGBoost models can be developed within the framework, and the resulting models can also be used for prediction. Descriptive statistics, data pre-processing and model tuning steps are also provided during model development.
Results: The Einsteindata4u dataset was analyzed with the Covid19PredictoR interface. This example demonstrates the complete operation of the interface and every step of the workflow. High-performance machine learning models were developed for the dataset, and the best models were used for prediction. Analysis and visualization of features (age, admission data and laboratory tests) were carried out for each model.
Conclusion: The use of machine learning algorithms to evaluate Covid-19 in terms of related risk factors is rapidly increasing. Applying these algorithms across different platforms creates practical difficulties as well as repeatability and reproducibility problems. The proposed pipeline, which the interface turns into a standard workflow, offers a user-friendly structure that healthcare professionals with various backgrounds can easily use for analysis and reporting.