Objective
To develop predictive models of COVID-19 outcomes, elucidate the influence of socioeconomic factors, and assess algorithmic racial fairness using a racially diverse patient population with high social needs.
Materials and Methods
Data included 7,102 patients with positive (RT-PCR) SARS-CoV-2 test at a safety-net system in Massachusetts. Linear and nonlinear classification methods were applied. A score based on a recurrent neural network and a transformer architecture was developed to capture the dynamic evolution of vital signs. Combined with patient characteristics, clinical variables, and hospital occupancy measures, this dynamic vital score was used to train predictive models.
Results
Hospitalizations can be predicted with an Area Under the ROC Curve (AUC) of 92% using symptoms, hospital occupancy, and patient characteristics, including social determinants of health. Parsimonious models to predict intensive care, mechanical ventilation, and mortality that used the most recent labs and vitals exhibited AUCs of 92.7%, 91.2%, and 94%, respectively. Early predictive models, using labs and vital signs closer to admission had AUCs of 81.1%, 84.9%, and 92%, respectively.
Discussion
The most accurate models exhibit racial bias, being more likely to falsely predict that Black patients will be hospitalized. Models that are only based on the dynamic vital score exhibited accuracies close to the best parsimonious models, although the latter also used laboratories.
Conclusion
This large study demonstrates that COVID-19 severity may accurately be predicted using a score that accounts for the dynamic evolution of vital signs. Further, race, social determinants of health, and hospital occupancy play an important role.