Background. Transparent and accessible reporting of COVID-19 data is critical for public health efforts. Each state and union territory (UT) of India has its own mechanism for reporting COVID-19 data, and the quality of their reporting has not been systematically evaluated. We present a comprehensive assessment of the quality of COVID-19 data reporting done by the Indian state and union territory governments. This assessment informs the public health efforts in India and serves as a guideline for pandemic data reporting by other governments.
Methods. We designed a semi-quantitative framework to assess the quality of COVID-19 data reporting by the states and union territories of India. The framework captures four key aspects of public health data reporting: availability, accessibility, granularity, and privacy. We used it to calculate a COVID-19 Data Reporting Score (CDRS, ranging from 0 to 1) for 29 states, based on the quality of each state's COVID-19 data reporting during the two-week period from 19 May to 1 June 2020. States that had reported fewer than 10 total confirmed cases as of 18 May were excluded from the study.
Findings. Our results indicate a wide disparity in the quality of COVID-19 data reporting by the state governments in India. CDRS varies from 0.61 (good) in Karnataka to 0.0 (poor) in Bihar and Uttar Pradesh, with a median value of 0.26. Only ten states provide a visual representation of the trend in COVID-19 data. Ten states do not report any data stratified by age, gender, comorbidities, or districts. In addition, we find that Punjab and Chandigarh compromised the privacy of individuals under quarantine by releasing their personally identifiable information on official websites. Across the states, the CDRS is positively associated with the state's sustainable development index for good health and well-being (Pearson correlation: r=0.630, p=0.0003).
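For readers who want to reproduce an association of the kind reported above, the following is a minimal sketch of the Pearson correlation coefficient and its t statistic. The function names and the input values are illustrative assumptions, not the study's code or data; the actual per-state scores and index values would have to be substituted.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def t_statistic(r, n):
    """t statistic for the null hypothesis of zero correlation (n - 2 d.o.f.)."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("need n >= 3 and |r| < 1")
    return r * math.sqrt((n - 2) / (1 - r * r))


# Placeholder values for illustration only; these are NOT the study's data.
example_cdrs = [0.10, 0.20, 0.30, 0.40, 0.50]
example_index = [40, 50, 45, 60, 70]

r = pearson_r(example_cdrs, example_index)
t = t_statistic(r, len(example_cdrs))
print(f"r = {r:.3f}, t = {t:.3f}")
```

A two-sided p-value follows from comparing the t statistic against a Student t distribution with n - 2 degrees of freedom, for example with a statistics library.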
Interpretation. The disparity in CDRS across states highlights three important findings at the national, state, and individual levels. At the national level, it shows the lack of a unified framework for reporting COVID-19 data in India and highlights the need for a central agency to monitor or audit the quality of data reporting by the states. Without a unified framework, it is difficult to aggregate data from different states, gain insights from them, and coordinate an effective nationwide response to the pandemic. At the state level, it reflects inadequate coordination or sharing of resources among the states in India. Such coordination is particularly important as more people start moving across states in the coming months. At the individual level, the disparity in reporting scores reflects inequality in access to public health information and in privacy protection based on the state of residence.
Funding. J.Z. is supported by NSF CCF 1763191, NIH R21 MD012867-01, NIH P30AG059307, NIH U01MH098953 and grants from the Silicon Valley Foundation and the Chan-Zuckerberg Initiative.