Background: Tests are scarce resources, especially in low- and middle-income countries, and optimizing testing programs during a pandemic is critical for effective disease control. Hence, we aim to use combinations of symptoms to build a regression model as a screening tool that identifies people and areas at higher risk of SARS-CoV-2 infection so that they can be prioritized for testing.
Materials and Methods: We performed a retrospective analysis of individuals registered in "Dados do Bem", an app-based symptom tracker in use in Brazil. We applied machine learning techniques and produced a risk map visualizing regions with potentially high densities of COVID-19 cases.
Results: From April 28 to July 16, 2020, 337,435 individuals registered their symptoms through the app. Of these, 49,721 participants were tested for SARS-CoV-2 infection, of whom 5,888 (11.8%) tested positive. Among self-reported symptoms, loss of smell (OR [95% CI]: 4.6 [4.4 - 4.9]), fever (2.6 [2.5 - 2.8]), and shortness of breath (2.1 [1.6 - 2.7]) were associated with SARS-CoV-2 infection. Our final model achieved competitive performance, with only 7% of the users predicted as negative being false negatives (NPV = 0.93). Of the 287,714 users not yet tested, our model estimated that only 34.5% are potentially infected, thus reducing the need for extensive testing of all registered users. The model was incorporated into the "Dados do Bem" app to prioritize users for testing. We performed an external validation in the state of Goias and found that, of the 465 users selected, 52% tested positive.
Conclusions: Our results show that combinations of symptoms might predict SARS-CoV-2 infection and, therefore, can be used as a tool by decision-makers to refine testing and disease control strategies.
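
For readers less familiar with the metrics quoted in the Results, the following is a minimal sketch of how a negative predictive value (NPV) and an odds ratio with a 95% confidence interval can be computed from confusion-matrix and 2x2 counts. It is not the authors' code: the function names and the example counts are hypothetical, and the Woolf (log-based) interval is an assumption, since the abstract does not state how its intervals were obtained. Only the 287,714 and 34.5% figures are taken from the abstract.

```python
import math

def npv(true_neg: int, false_neg: int) -> float:
    """Negative predictive value: share of predicted negatives that are truly negative."""
    return true_neg / (true_neg + false_neg)

def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio with a Woolf (log-based) confidence interval for a 2x2 table.

    a: symptom present, test positive    b: symptom present, test negative
    c: symptom absent,  test positive    d: symptom absent,  test negative
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, chosen only to illustrate the definitions.
example_npv = npv(true_neg=93, false_neg=7)
example_or, example_lo, example_hi = odds_ratio_ci(a=20, b=10, c=10, d=20)

# Figures from the abstract: 34.5% of the 287,714 untested users flagged by the model.
flagged_untested = round(287_714 * 0.345)

print(f"NPV = {example_npv:.2f}")
print(f"OR = {example_or:.1f} [{example_lo:.2f} - {example_hi:.2f}]")
print(f"Untested users flagged as potentially infected: {flagged_untested}")
```

An NPV of 0.93 corresponds to 7 false negatives per 100 predicted negatives, which is the sense in which the abstract reports 7% false-negative users among those predicted as negative.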