A. Background
The novel coronavirus disease (COVID-19), which emerged in late 2019, has shown that research based on open data can be a cornerstone for meeting the need for collaborative, optimized, and urgent analysis. Although several articles have been published, identifying variables that correlate with a positive PCR result remains a challenge. In this paper we present a concrete example of open data analysis using 910 hospital patients who underwent SARS-CoV-2 RT-PCR testing at three private institutions in São Paulo, Brazil.

B. Results
We performed an exploratory analysis using principal component analysis, feature selection, and predictive algorithms to test for associations between a number of laboratory test abnormalities and the SARS-CoV-2 RT-PCR result. Specifically, we found a set of 18 variables that showed some association with a positive PCR result.

C. Conclusion
Among these variables, elevated lactate dehydrogenase (LDH) and D-dimer were the most strongly correlated with a positive RT-PCR result. We developed a classifier that achieved 76% mean accuracy, 77% mean precision, and 92% mean sensitivity in identifying individuals with COVID-19.
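For readers less familiar with the three reported metrics, the following is a minimal sketch of how accuracy, precision, and sensitivity are defined for a binary classifier. The function name `classification_metrics` and the toy labels are hypothetical and chosen only for illustration; they are not the study's code or data, and the abstract does not state how the reported means were computed.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision and sensitivity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Precision: share of predicted positives that are truly positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Sensitivity (recall): share of true positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "sensitivity": sensitivity}


# Invented toy labels (NOT study data): 1 = RT-PCR positive, 0 = negative.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 0, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In this toy case every true positive is detected, so sensitivity is high, while the two false positives lower precision. This is the same qualitative pattern as a classifier whose sensitivity exceeds its precision.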