Background
Since the outbreak of the COVID-19 pandemic, the excess mortality P-score has gained prominence as a measure of pandemic burden. The P-score indicates the percentage by which observed deaths deviate from expected deaths. As the P-score is regularly used to compare excess mortality between countries, questions arise regarding the age dependency of the measure. In this paper we present formal and empirical results on the population structure bias of the P-score with a special focus on cross-country comparisons during the COVID-19 pandemic in Europe.
Methods
P-scores were calculated for European countries for 2021, 2022, and 2023 using data from the 2024 revision of the United Nations’ World Population Prospects and the Human Mortality Database (HMD) Short-term Mortality Fluctuations data series. The expected deaths for 2021, 2022, and 2023 were estimated using a Lee–Carter forecast model assuming pre-pandemic conditions. P-score differences between countries were decomposed using a Kitagawa-type decomposition into excess-mortality and structural components. To investigate the sensitivity of P-score cross-country rankings to differences in population structure, we calculated the rank correlation between age-standardized and classical P-scores.
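As an illustration of the decomposition step, the following minimal sketch applies the textbook Kitagawa formula to the difference between two countries' P-scores. It assumes each P-score is written as a sum over age groups of a weight (the share of expected deaths in that age group) times the age-specific percent excess; the exact variant used in the paper may differ. The function name `kitagawa_decomposition` and all numbers are hypothetical.

```python
def kitagawa_decomposition(w_a, p_a, w_b, p_b):
    """Split P_A - P_B into an excess-mortality and a structural component.

    w_a, w_b: shares of expected deaths by age group (each sums to 1).
    p_a, p_b: age-specific P-scores in percent.
    This is the standard Kitagawa form, assumed here for illustration.
    """
    # Difference in age-specific excess, evaluated at the average weights.
    excess = sum((pa - pb) * (wa + wb) / 2
                 for wa, pa, wb, pb in zip(w_a, p_a, w_b, p_b))
    # Difference in weights, evaluated at the average age-specific excess.
    structure = sum((wa - wb) * (pa + pb) / 2
                    for wa, pa, wb, pb in zip(w_a, p_a, w_b, p_b))
    return excess, structure


# Hypothetical two-age-group example (not data from the paper).
w_a, p_a = [0.2, 0.8], [5.0, 15.0]
w_b, p_b = [0.4, 0.6], [5.0, 10.0]
excess, structure = kitagawa_decomposition(w_a, p_a, w_b, p_b)

# The two components add up to the total P-score difference.
total_diff = (sum(w * p for w, p in zip(w_a, p_a))
              - sum(w * p for w, p in zip(w_b, p_b)))
```

In this form the two components sum exactly to the total difference, so no residual or interaction term remains.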
Results
The P-score is an average of age-specific percent excess deaths weighted by the age distribution of expected deaths. The decomposition shows that differences in the age distribution of expected deaths play only a marginal role in a European comparison; in most cases, the excess-mortality component dominates. P-score rankings among European countries during the COVID-19 pandemic are similar under both age-standardized and classical P-scores.
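The weighted-average property can be checked numerically. The sketch below computes the P-score once from total deaths and once as a weighted average of age-specific P-scores, with weights equal to each age group's share of expected deaths. The function names and the death counts are hypothetical and serve only to illustrate the identity.

```python
def p_score(observed, expected):
    """Classical P-score in percent, from deaths summed over age groups."""
    total_expected = sum(expected)
    return 100.0 * (sum(observed) - total_expected) / total_expected


def p_score_weighted(observed, expected):
    """The same quantity, as a weighted average of age-specific P-scores."""
    total_expected = sum(expected)
    # Weight of each age group: its share of all expected deaths.
    weights = [e / total_expected for e in expected]
    # Age-specific percent excess deaths.
    age_p = [100.0 * (o - e) / e for o, e in zip(observed, expected)]
    return sum(w * p for w, p in zip(weights, age_p))


# Hypothetical deaths in three age groups (not data from the paper).
observed = [110.0, 240.0, 690.0]
expected = [100.0, 200.0, 600.0]

direct = p_score(observed, expected)
weighted = p_score_weighted(observed, expected)
```

Both routes give the same value because the expected deaths in each weight cancel against the denominator of the corresponding age-specific P-score.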
Conclusions
Although the P-score formally depends on the age distribution of expected deaths, this structural component plays only a minor role in a European comparison, as the distribution of deaths across the continent is similar. Thus, the P-score is suitable as a measure of excess mortality in a European comparison, as it mainly reflects the differences in excess mortality. However, this finding should not be extrapolated to global comparisons, where countries could have very different death distributions. In situations where P-score comparisons are biased, age-standardization can be applied as a solution.
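One way to age-standardize, sketched here under assumptions, is to replace each country's own distribution of expected deaths with a common set of standard weights before averaging the age-specific P-scores. The function name `age_standardized_p_score` and the choice of standard weights are hypothetical; the paper's standard may be defined differently.

```python
def age_standardized_p_score(observed, expected, standard_weights):
    """Average age-specific P-scores using a common standard distribution.

    standard_weights: non-negative weights shared by all countries being
    compared (assumed for illustration); they are normalized to sum to 1.
    """
    total_weight = sum(standard_weights)
    return sum(
        (w / total_weight) * 100.0 * (o - e) / e
        for w, o, e in zip(standard_weights, observed, expected)
    )


# Hypothetical deaths in three age groups and hypothetical standard weights.
observed = [110.0, 240.0, 690.0]
expected = [100.0, 200.0, 600.0]
standard_weights = [1.0, 1.0, 2.0]

standardized = age_standardized_p_score(observed, expected, standard_weights)
```

Because every country is averaged with the same weights, any remaining difference between two standardized P-scores reflects differences in age-specific excess mortality only.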