Background
Many studies aim to estimate the effect of risk factors on outcomes, yet results may depend on which other risk factors or potential confounders are included in the statistical model. For complex and largely unexplored systems, such as the COVID-19 spreading process, a priori knowledge of potential confounders is lacking, and data-driven empirical variable selection methods may be the primary option. Published studies often lack a sensitivity analysis of how results depend on the choice of confounders in the model. This study examined the variability in associations of short-term air pollution with COVID-19 mortality in Germany under multiple approaches to accounting for confounders in statistical models.
Methods
Associations between the air pollution variables PM2.5, PM10, CO, NO, NO2, and O3 and cumulative COVID-19 deaths in 400 German districts were assessed via negative binomial models for two time periods, March 2020–February 2021 and March 2021–February 2022. Prevalent methods for confounder adjustment, including change-in-estimate and information criteria approaches, were identified through a literature search. The methods were compared to assess their impact on the estimated associations between air pollution and COVID-19 mortality, considering 37 potential confounders.
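As an illustration of the change-in-estimate idea named above, the following is a minimal sketch of one common forward variant, not the procedure used in this study. It assumes a 10% relative-change threshold and a caller-supplied function that refits the outcome model (for example, a negative binomial regression) and returns the exposure coefficient. The function names, the covariate `covariate_x`, and all numbers in `TOY_COEFS` are hypothetical and chosen only to make the example runnable.

```python
from typing import Callable, Dict, FrozenSet, Iterable, List


def change_in_estimate(
    fit_exposure_coef: Callable[[List[str]], float],
    candidates: Iterable[str],
    threshold: float = 0.10,  # assumed cut-off; 10% is a common convention
) -> List[str]:
    """Forward change-in-estimate selection (illustrative sketch).

    fit_exposure_coef(covariates) is expected to refit the outcome model with
    the given adjustment set and return the exposure coefficient.
    """
    selected: List[str] = []
    remaining = list(candidates)
    current = fit_exposure_coef(selected)  # exposure-only (unadjusted) estimate

    while remaining:
        # Relative change in the exposure coefficient when each candidate is added.
        changes: Dict[str, float] = {}
        for cand in remaining:
            new = fit_exposure_coef(selected + [cand])
            if current != 0:
                changes[cand] = abs(new - current) / abs(current)
            else:
                changes[cand] = 0.0 if new == current else float("inf")

        best = max(changes, key=changes.get)
        if changes[best] < threshold:
            break  # no remaining candidate shifts the estimate enough
        selected.append(best)
        remaining.remove(best)
        current = fit_exposure_coef(selected)

    return selected


# Hypothetical exposure coefficients per adjustment set (made-up numbers,
# NOT results from the study), standing in for real model refits.
TOY_COEFS: Dict[FrozenSet[str], float] = {
    frozenset(): -0.50,
    frozenset({"mobility"}): -0.20,
    frozenset({"age"}): -0.40,
    frozenset({"covariate_x"}): -0.49,
    frozenset({"mobility", "age"}): -0.10,
    frozenset({"mobility", "covariate_x"}): -0.20,
    frozenset({"mobility", "age", "covariate_x"}): -0.10,
}


def toy_fit(covariates: List[str]) -> float:
    """Stand-in for a model refit: looks up the hypothetical coefficient."""
    return TOY_COEFS[frozenset(covariates)]


if __name__ == "__main__":
    print(change_in_estimate(toy_fit, ["mobility", "age", "covariate_x"]))
```

In practice, `toy_fit` would be replaced by a function that fits the actual regression model on the district-level data. Information criteria approaches differ in that they compare candidate models by a penalized likelihood measure rather than by the shift in the exposure coefficient.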
Results
Univariate analyses showed significant negative associations with COVID-19 mortality for CO, NO, and NO2, and positive associations, at least for the first time period, for O3 and PM2.5. However, these associations became non-significant when other risk factors were accounted for in the model, in particular after adjustment for mobility, political orientation, and age. Estimates from most selection methods were similar to those from models including all risk factors.
Conclusion
The results highlight the importance of adequately accounting for high-impact confounders when analyzing associations of air pollution with COVID-19, and show that comparing multiple selection approaches can be helpful. This study showed how model selection can be performed with different methods in the context of high-dimensional and correlated covariates when important confounders are not known a priori. Apparent associations between air pollution and COVID-19 mortality failed to reach significance when leading selection methods were used.