Underreporting of COVID-19 cases and deaths is a hindrance to correctly modeling and monitoring the pandemic. This is primarily due to limited testing, lack of reporting infrastructure and a large number of asymptomatic infections. In addition, diagnostic tests (RT-PCR tests for detecting current infection) and serological antibody tests for IgG (to assess past infections) are imperfect. In particular, the diagnostic tests have a high false negative rate. Epidemiologic models with a latent compartment for unascertained infections, like the Susceptible-Exposed-Infected-Removed (SEIR) models, can provide predictions for unreported cases and deaths under certain assumptions. Because the number of unascertained cases is typically unobserved, these estimates cannot be validated in a real study, only in simulation studies. Population-based seroprevalence studies can provide a rough estimate of the total number of infections and help us check epidemiologic model projections. In this paper, we develop a method to account for high false negative rates in RT-PCR in an extension to the classic SEIR model. We apply this method to Delhi, the national capital region of India, with a population of 19.8 million and a COVID-19 hotspot of the country, obtaining estimates of the underreporting factor of 34-53 for cases and 8-13 for deaths. Based on a recently released serological survey for Delhi with an estimated 22.86% seroprevalence, we compute adjusted estimates of the true number of infections reported by the survey (after accounting for misclassification of the antibody test results), which are largely consistent with the model outputs, yielding an underreporting factor for cases of 30-42. Together, the model and the serosurvey imply that approximately 96-98% of cases in Delhi remained unreported: whereas only 109,140 cases were reported on July 10, the estimated true number of infections ranged between 4.4 and 4.6 million across different estimates. 
While repeated serological monitoring is resource intensive, model-based adjustments, run with the most up to date data, can provide a viable option to keep track of the unreported cases and deaths and gauge the true extent of transmission of this insidious virus.
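The serosurvey adjustment described above can be illustrated with a short sketch. The abstract does not say which correction the authors used; the code below swaps in the standard Rogan-Gladen estimator as one common way to adjust an apparent seroprevalence for test misclassification. The population, seroprevalence and reported case count come from the abstract, while the sensitivity and specificity values are hypothetical placeholders, not figures from the paper.

```python
def rogan_gladen(apparent_prev, sensitivity, specificity):
    """Adjust an apparent prevalence for imperfect test accuracy.

    Rogan-Gladen estimator: (p + Sp - 1) / (Se + Sp - 1), clipped to [0, 1].
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("test must be better than chance (Se + Sp > 1)")
    adjusted = (apparent_prev + specificity - 1.0) / denom
    return min(max(adjusted, 0.0), 1.0)


def underreporting_factor(estimated_total, reported):
    """Ratio of estimated total infections to reported cases."""
    return estimated_total / reported


# Figures quoted in the abstract.
POPULATION = 19.8e6        # Delhi population
SEROPREVALENCE = 0.2286    # estimated seroprevalence from the survey
REPORTED_CASES = 109_140   # cases reported on July 10

# Hypothetical antibody-test characteristics (assumptions for illustration).
ASSUMED_SENSITIVITY = 0.95
ASSUMED_SPECIFICITY = 0.98

adjusted_prev = rogan_gladen(SEROPREVALENCE, ASSUMED_SENSITIVITY, ASSUMED_SPECIFICITY)
estimated_infections = adjusted_prev * POPULATION
urf = underreporting_factor(estimated_infections, REPORTED_CASES)

print(f"adjusted prevalence: {adjusted_prev:.4f}")
print(f"estimated infections: {estimated_infections:,.0f}")
print(f"underreporting factor: {urf:.1f}")
```

Different assumed sensitivity and specificity values shift the adjusted prevalence, which is one plausible reason the abstract reports a range rather than a single underreporting factor.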
Background Many popular disease transmission models have helped nations respond to the COVID-19 pandemic by informing decisions about pandemic planning, resource allocation, implementation of social distancing measures and other non-pharmaceutical interventions. We study how five epidemiological models forecast and assess the course of the pandemic in India: a baseline model, an extended SIR (eSIR) model, two extended SEIR (SAPHIRE and SEIR-fansy) models, and a semi-mechanistic Bayesian hierarchical model (ICM). Methods Using COVID-19 data for India from March 15 to June 18 to train the models, we generate predictions from each of the five models from June 19 to July 18. To compare prediction accuracy with respect to reported cumulative and active case counts and cumulative death counts, we compute the symmetric mean absolute prediction error (SMAPE) for each of the five models. Results For active case counts, SMAPE values are 0.72 (SEIR-fansy) and 33.83 (eSIR). For cumulative case counts, SMAPE values are 1.76 (baseline), 23. (eSIR), 2.07 (SAPHIRE) and 3.20 (SEIR-fansy). For cumulative death counts, the SMAPE values are 7.13 (SEIR-fansy) and 26.30 (eSIR). For cumulative cases and deaths, we compute Pearson’s and Lin’s correlation coefficients to investigate how well the projected and observed reported COVID-19 counts agree. Three models (SAPHIRE, SEIR-fansy and ICM) return total (sum of reported and unreported) counts as well. We compute underreporting factors as of June 30 and note that the SEIR-fansy model reports the highest underreporting factor for active cases (6.10) and cumulative deaths (3.62), while the SAPHIRE model reports the highest underreporting factor for cumulative cases (27.79). Conclusions In this comparative paper we describe five different models used to study the transmission of SARS-CoV-2 in India. 
While simulation studies are the only gold standard way to compare the accuracy of the models, here we were uniquely poised to compare the projected case-counts against observed data on a test period. Prediction of daily active number of cases does show appreciable variation across models. The largest variability across models is observed in predicting the “total” number of infections including reported and unreported cases. The degree of under-reporting has been a major concern in India.
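As a rough illustration of the accuracy metric named in the Methods, the sketch below computes SMAPE using one widely used definition (absolute error divided by the average of the absolute observed and predicted values, expressed as a percentage). The abstract does not give the exact formula, so this definition is an assumption, and the example counts are made up for illustration only.

```python
def smape(observed, predicted):
    """Symmetric mean absolute prediction error, in percent.

    Uses the common form 100/n * sum(|F - A| / ((|A| + |F|) / 2)).
    Pairs where both values are zero contribute zero error.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("need at least one observation")
    total = 0.0
    for actual, forecast in zip(observed, predicted):
        denom = (abs(actual) + abs(forecast)) / 2.0
        if denom == 0:
            continue  # both values are zero, so the error is zero
        total += abs(forecast - actual) / denom
    return 100.0 * total / len(observed)


# Hypothetical daily cumulative case counts over a short test period.
observed_counts = [1000, 1100, 1250, 1400]
projected_counts = [980, 1120, 1300, 1500]

print(f"SMAPE: {smape(observed_counts, projected_counts):.2f}%")
```

Because the denominator averages the observed and predicted values, the metric is bounded and treats over- and under-prediction more evenly than a plain percentage error.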
Objective There has been much discussion and debate around the underreporting of COVID-19 infections and deaths in India. In this short report we first estimate the underreporting factor for infections from publicly available data released by the Indian Council of Medical Research on the reported number of cases and national seroprevalence surveys. We then use a compartmental epidemiologic model to estimate the undetected number of infections and deaths, yielding estimates of the corresponding underreporting factors. We compare the serosurvey-based ad hoc estimate of the infection fatality rate (IFR) with the model-based estimate. Since the first and second waves in India are intrinsically different in nature, we carry out this exercise in two periods: the first wave (April 1, 2020–January 31, 2021) and part of the second wave (February 1, 2021–May 15, 2021). The latest national seroprevalence estimate is from January 2021, and thus only relevant to our wave 1 calculations. Results Both wave 1 and wave 2 estimates qualitatively show that there is a large degree of “covert infections” in India, with a model-based estimated underreporting factor of 11.11 (95% credible interval (CrI) 10.71–11.47) for infections and 3.56 (95% CrI 3.48–3.64) for deaths in wave 1. For wave 2, the underreporting factor escalates to 26.77 (95% CrI 24.26–28.81) for infections and to 5.77 (95% CrI 5.34–6.15) for deaths. If we rely on only reported deaths, the IFR estimate is 0.13% for wave 1 and 0.03% for part of wave 2. Taking underreporting of deaths into account, the IFR estimate is 0.46% for wave 1 and 0.18% for wave 2 (till May 15). Combining waves 1 and 2, as of May 15, while India reported a total of nearly 25 million cases and 270 thousand deaths, the estimated numbers of infections and deaths stand at 491 million (36% of the population) and 1.21 million respectively, yielding an estimated (combined) infection fatality rate of 0.25%. 
There is considerable variation in these estimates across Indian states. Up to date seroprevalence studies and mortality data are needed to validate these model-based estimates.
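The combined figures quoted above can be checked with simple arithmetic. The sketch below uses only the rounded totals given in the abstract (the reported case count is described there as "nearly 25 million", so the resulting ratios are approximate); the function names are illustrative and not taken from the paper.

```python
def infection_fatality_rate(deaths, infections):
    """Deaths per infection, as a fraction."""
    return deaths / infections


def underreporting_factor(estimated, reported):
    """Ratio of an estimated total to the reported count."""
    return estimated / reported


# Rounded combined totals for waves 1 and 2 as of May 15, from the abstract.
REPORTED_CASES = 25e6          # "nearly 25 million", so approximate
REPORTED_DEATHS = 270e3
ESTIMATED_INFECTIONS = 491e6
ESTIMATED_DEATHS = 1.21e6

# IFR using estimated deaths (accounts for underreported deaths).
ifr_adjusted = infection_fatality_rate(ESTIMATED_DEATHS, ESTIMATED_INFECTIONS)
# IFR using reported deaths only.
ifr_reported = infection_fatality_rate(REPORTED_DEATHS, ESTIMATED_INFECTIONS)

urf_cases = underreporting_factor(ESTIMATED_INFECTIONS, REPORTED_CASES)
urf_deaths = underreporting_factor(ESTIMATED_DEATHS, REPORTED_DEATHS)

print(f"IFR (estimated deaths): {100 * ifr_adjusted:.2f}%")
print(f"IFR (reported deaths only): {100 * ifr_reported:.3f}%")
print(f"combined underreporting factor, cases: {urf_cases:.1f}")
print(f"combined underreporting factor, deaths: {urf_deaths:.1f}")
```

Dividing estimated deaths by estimated infections reproduces the combined IFR of roughly 0.25% stated in the abstract.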
India experienced a massive surge in SARS-CoV-2 infections and deaths during April to June 2021 despite having controlled the epidemic relatively well during 2020. Using counterfactual predictions from epidemiological disease transmission models, we produce evidence in support of how strengthening public health interventions early would have helped control transmission in the country and significantly reduced mortality during the second wave, even without harsh lockdowns. We argue that enhanced surveillance at district, state, and national levels and constant assessment of risk associated with increased transmission are critical for future pandemic responsiveness. Building on our retrospective analysis, we provide a tiered data-driven framework for timely escalation of future interventions as a tool for policy-makers.
Background Many popular disease transmission models have helped nations respond to the COVID-19 pandemic by informing decisions about pandemic planning, resource allocation, implementation of social distancing measures, lockdowns, and other non-pharmaceutical interventions. We study how five epidemiological models forecast and assess the course of the pandemic in India: a baseline curve-fitting model, an extended SIR (eSIR) model, two extended SEIR (SAPHIRE and SEIR-fansy) models, and a semi-mechanistic Bayesian hierarchical model (ICM). Methods Using COVID-19 case-recovery-death count data reported in India from March 15 to October 15 to train the models, we generate predictions from each of the five models from October 16 to December 31. To compare prediction accuracy with respect to reported cumulative and active case counts and reported cumulative death counts, we compute the symmetric mean absolute prediction error (SMAPE) for each of the five models. For reported cumulative cases and deaths, we compute Pearson’s and Lin’s correlation coefficients to investigate how well the projected and observed reported counts agree. We also present underreporting factors when available, and comment on uncertainty of projections from each model. Results For active case counts, SMAPE values are 35.14% (SEIR-fansy) and 37.96% (eSIR). For cumulative case counts, SMAPE values are 6.89% (baseline), 6.59% (eSIR), 2.25% (SAPHIRE) and 2.29% (SEIR-fansy). For cumulative death counts, the SMAPE values are 4.74% (SEIR-fansy), 8.94% (eSIR) and 0.77% (ICM). Three models (SAPHIRE, SEIR-fansy and ICM) return total (sum of reported and unreported) cumulative case counts as well. We compute underreporting factors as of October 31 and note that for cumulative cases, the SEIR-fansy model yields an underreporting factor of 7.25 and ICM model yields 4.54 for the same quantity. 
For total (sum of reported and unreported) cumulative deaths the SEIR-fansy model reports an underreporting factor of 2.97. On October 31, we observe 8.18 million cumulative reported cases; the corresponding projections (in millions) are 8.71 (95% credible interval: 8.63–8.80) from the baseline model, 8.35 (7.19–9.60) from eSIR, 8.17 (7.90–8.52) from SAPHIRE and 8.51 (8.18–8.85) from SEIR-fansy. Cumulative case projections from the eSIR model have the highest uncertainty in terms of width of 95% credible intervals, followed by those from SAPHIRE, the baseline model and finally SEIR-fansy. Conclusions In this comparative paper, we describe five different models used to study the transmission dynamics of the SARS-CoV-2 virus in India. While simulation studies are the only gold standard way to compare the accuracy of the models, here we were uniquely poised to compare the projected case-counts against observed data on a test period. The largest variability across models is observed in predicting the “total” number of infections including reported and unreported cases (on which we have no validation data). The degree of under-reporting has been a major concern in India and is characterized in this report. Overall, the SEIR-fansy model appeared to be a good choice, with a publicly available R package and the desired flexibility and accuracy.
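The two agreement measures named in the Methods, Pearson's correlation and Lin's concordance correlation coefficient, can be sketched as follows. This is a minimal illustration using the textbook definitions with population (1/n) moments; the paper's exact computation is not given in the abstract, and the example series are made up.

```python
import math


def _moments(x, y):
    """Return means, population variances and population covariance."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return mean_x, mean_y, var_x, var_y, cov


def pearson(x, y):
    """Pearson correlation: measures linear association only."""
    _, _, var_x, var_y, cov = _moments(x, y)
    return cov / math.sqrt(var_x * var_y)


def lins_ccc(x, y):
    """Lin's concordance correlation: also penalizes shifts in location and scale."""
    mean_x, mean_y, var_x, var_y, cov = _moments(x, y)
    return 2.0 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical observed and projected counts: the projection tracks the trend
# perfectly but is shifted upward by a constant.
observed = [1.0, 2.0, 3.0]
projected = [2.0, 3.0, 4.0]

print(f"Pearson: {pearson(observed, projected):.3f}")
print(f"Lin's CCC: {lins_ccc(observed, projected):.3f}")
```

Reporting both measures is informative because a projection can be perfectly correlated with the observed counts while still being systematically too high or too low, which only the concordance coefficient reflects.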