Objective
To derive and validate a risk prediction algorithm to estimate hospital admission and mortality outcomes from coronavirus disease 2019 (covid-19) in adults.
Design
Population based cohort study.
Setting and participants
QResearch database, comprising 1205 general practices in England with linkage to covid-19 test results, Hospital Episode Statistics, and death registry data. 6.08 million adults aged 19-100 years were included in the derivation dataset and 2.17 million in the validation dataset. The derivation and first validation cohort period was 24 January 2020 to 30 April 2020. The second temporal validation cohort covered the period 1 May 2020 to 30 June 2020.
Main outcome measures
The primary outcome was time to death from covid-19, defined as death due to confirmed or suspected covid-19 as per the death certification or death occurring in a person with confirmed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection in the period 24 January to 30 April 2020. The secondary outcome was time to hospital admission with confirmed SARS-CoV-2 infection. Models were fitted in the derivation cohort to derive risk equations using a range of predictor variables. Performance, including measures of discrimination and calibration, was evaluated in each validation time period.
Results
4384 deaths from covid-19 occurred in the derivation cohort during follow-up and 1722 in the first validation cohort period and 621 in the second validation cohort period. The final risk algorithms included age, ethnicity, deprivation, body mass index, and a range of comorbidities. The algorithm had good calibration in the first validation cohort. For deaths from covid-19 in men, it explained 73.1% (95% confidence interval 71.9% to 74.3%) of the variation in time to death (R²); the D statistic was 3.37 (95% confidence interval 3.27 to 3.47), and Harrell’s C was 0.928 (0.919 to 0.938). Similar results were obtained for women, for both outcomes, and in both time periods. In the top 5% of patients with the highest predicted risks of death, the sensitivity for identifying deaths within 97 days was 75.7%. People in the top 20% of predicted risk of death accounted for 94% of all deaths from covid-19.
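As an illustration of two of the performance measures reported above, the following minimal sketch computes Harrell's C and the sensitivity within a top fraction of predicted risk. It is not the study's code: the function names (`harrell_c`, `sensitivity_at_top`) and the ten-person toy dataset are hypothetical, invented purely for illustration, and pairs with tied follow-up times are simply skipped, which is a simplification.

```python
from itertools import combinations


def harrell_c(times, events, risks):
    """Harrell's C: share of comparable pairs in which the person who
    died earlier was given the higher predicted risk (risk ties count 0.5)."""
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip tied times; a pair is comparable only if
        # the shorter follow-up ended in an observed death.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risks[a] > risks[b]:
            concordant += 1.0
        elif risks[a] == risks[b]:
            concordant += 0.5
    return concordant / comparable


def sensitivity_at_top(events, risks, fraction):
    """Share of all deaths that occur among the top `fraction` of
    people ranked by predicted risk."""
    n_top = max(1, round(len(risks) * fraction))
    ranked = sorted(range(len(risks)), key=lambda k: risks[k], reverse=True)
    deaths_in_top = sum(events[k] for k in ranked[:n_top])
    return deaths_in_top / sum(events)


# Hypothetical toy data: follow-up in days (capped at 97), death indicator
# (1 = died, 0 = censored), and predicted risk of death.
times = [5, 10, 20, 30, 97, 97, 97, 97, 97, 97]
events = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
risks = [0.9, 0.8, 0.3, 0.5, 0.2, 0.1, 0.1, 0.05, 0.05, 0.01]

c_index = harrell_c(times, events, risks)
top20_sensitivity = sensitivity_at_top(events, risks, 0.20)
print(f"Harrell's C: {c_index:.3f}")
print(f"Sensitivity in top 20% of predicted risk: {top20_sensitivity:.3f}")
```

A C value near 1 means the model almost always ranks the person who died earlier as higher risk, whereas 0.5 corresponds to chance ranking.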
Conclusion
The QCOVID population based risk algorithm performed well, showing very high levels of discrimination for deaths and hospital admissions due to covid-19. The absolute risks presented, however, will change over time in line with the prevailing SARS-CoV-2 infection rate and the extent of social distancing measures in place, so they should be interpreted with caution. The model can be recalibrated for different time periods, however, and has the potential to be dynamically updated as the pandemic evolves.