Background: Malaria remains a major global health burden, with more than 3.2 billion people in 91 countries at risk of the disease. Accurately distinguishing malaria from other diseases, especially uncomplicated malaria (UM) from non-malarial infections (nMI), remains a challenge. Furthermore, the success of rapid diagnostic tests (RDTs) is threatened by Pfhrp2/3 deletions and decreased sensitivity at low parasitemia. Analysis of haematological indices can support the identification of possible malaria cases for further diagnosis, especially in travelers returning from endemic areas. As a new application for precision medicine, we aimed to evaluate machine learning (ML) approaches that can accurately classify nMI, UM and severe malaria (SM) using haematological parameters.
Methods: We obtained haematological data from 2,207 participants in Ghana: nMI (n=978), UM (n=526), and SM (n=703). Six different machine learning approaches were tested to select the best-performing one. An artificial neural network (ANN) with three hidden layers was used for multi-class classification of UM, SM, and nMI. Binary classifiers were developed to further identify the parameters that can distinguish UM or SM from nMI. Local interpretable model-agnostic explanations (LIME) were used to explain the binary classifiers.
Results: The multi-class classification model had greater than 85% training and testing accuracy in distinguishing clinical malaria from nMI. To distinguish UM from nMI, our approach identified platelet counts, red blood cell (RBC) counts, lymphocyte counts and percentages as the top classifiers of UM, with a test accuracy of 0.801 (AUC = 0.866 and F1-score = 0.747). To distinguish SM from nMI, the classifier had a test accuracy of 0.960 (AUC = 0.983 and F1-score = 0.944), with mean platelet volume and mean cell volume being the unique classifiers of SM. Random forest was used to confirm the classifications, and it showed that platelet and RBC counts were the major classifiers of UM, regardless of possible confounders such as patient age and sampling location.
Conclusions: The study provides proof-of-concept methods that distinguish UM and SM from nMI, showing that ML approaches are a feasible tool for clinical decision support. In the future, ML approaches could be incorporated into clinical decision-support algorithms for the diagnosis of acute febrile illness and for monitoring response to acute SM treatment, particularly in endemic settings.
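
For readers unfamiliar with the metrics reported in the Results (test accuracy, F1-score and AUC), the following is a minimal sketch of how each can be computed for a binary classifier such as UM versus nMI. It is not the study's code: the function names, the 0.5 decision threshold, and the eight toy labels and scores are all hypothetical, chosen only so the arithmetic can be followed by hand.

```python
# Illustrative only: toy labels and scores, NOT data from the study.
# Label 1 = malaria (e.g. UM), label 0 = non-malarial infection (nMI).

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def roc_auc(y_true, scores, positive=1):
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier outputs for eight patients.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
# Assumed decision threshold of 0.5 to turn scores into labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print("accuracy:", accuracy(y_true, y_pred))   # 6 of 8 correct
print("F1-score:", f1_score(y_true, y_pred))
print("AUC:", roc_auc(y_true, scores))         # 13 of 16 pairs ranked correctly
```

Accuracy and F1-score depend on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is why the three values in the Results need not agree closely.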