Abstract. New precipitation (P) datasets are released regularly, following
innovations in weather forecasting models, satellite retrieval methods, and
multi-source merging techniques. Using the conterminous US as a case study,
we evaluated the performance of 26 gridded (sub-)daily P datasets to obtain
insight into the merit of these innovations. The evaluation was performed at
a daily timescale for the period 2008–2017 using the Kling–Gupta efficiency
(KGE), a performance metric combining correlation, bias, and variability. As
a reference, we used the high-resolution (4 km) Stage-IV gauge-radar P
dataset. Among the three KGE components, the P datasets performed worst
overall in terms of correlation (related to event identification). For
improving the KGE scores of these datasets, improved P totals (affecting
the bias score) and improved distribution of P intensity (affecting the
variability score) are therefore of secondary importance. Among the 11 gauge-corrected
P datasets, the best overall performance was obtained by MSWEP V2.2,
underscoring the importance of applying daily gauge corrections and
accounting for gauge reporting times. Several uncorrected P datasets
outperformed gauge-corrected ones. Among the 15 uncorrected P datasets, the
best performance was obtained by the ERA5-HRES fourth-generation reanalysis,
reflecting the significant advances in earth system modeling during the last
decade. The (re)analyses generally performed better in winter than in summer,
while the opposite was the case for the satellite-based datasets. IMERGHH V05
performed substantially better than TMPA-3B42RT V7, attributable to the many
improvements implemented in the IMERG satellite P retrieval algorithm.
IMERGHH V05 outperformed ERA5-HRES in regions dominated by convective storms,
while the opposite was observed in regions of complex terrain. The ERA5-EDA
ensemble average exhibited higher correlations than the ERA5-HRES
deterministic run, highlighting the value of ensemble modeling. The WRF
regional convection-permitting climate model showed considerably more
accurate P totals over the mountainous west and performed best among the
uncorrected datasets in terms of variability, suggesting there is merit in
using high-resolution models to obtain climatological P statistics. Our
findings provide some guidance for choosing the most suitable P dataset for a
particular application.
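
As an illustration of how the KGE combines correlation, bias, and variability, the following is a minimal sketch rather than the code used in the evaluation. It assumes the modified formulation, in which bias is the ratio of means and variability is the ratio of coefficients of variation; the original formulation uses the ratio of standard deviations instead, and the abstract does not state which variant applies. The function names `kge_components` and `kge` and the sample values are hypothetical.

```python
import math


def kge_components(sim, obs):
    """Return (r, beta, gamma) for two paired daily P series.

    r     : Pearson correlation (event timing / identification)
    beta  : ratio of means (bias in P totals)
    gamma : ratio of coefficients of variation (variability);
            assumed here, the original KGE uses the ratio of standard deviations
    """
    if len(sim) != len(obs) or len(obs) < 2:
        raise ValueError("sim and obs must be paired series with at least two values")
    n = len(obs)
    mean_s = sum(sim) / n
    mean_o = sum(obs) / n
    # Population standard deviations of the estimate and the reference
    sd_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / n)
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    if mean_s == 0 or mean_o == 0 or sd_s == 0 or sd_o == 0:
        raise ValueError("KGE is undefined for zero-mean or constant series")
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs)) / n
    r = cov / (sd_s * sd_o)
    beta = mean_s / mean_o
    gamma = (sd_s / mean_s) / (sd_o / mean_o)
    return r, beta, gamma


def kge(sim, obs):
    """KGE = 1 - Euclidean distance of (r, beta, gamma) from the ideal (1, 1, 1)."""
    r, beta, gamma = kge_components(sim, obs)
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (beta - 1.0) ** 2 + (gamma - 1.0) ** 2)


# Hypothetical daily P values (mm/day) at one grid cell
reference = [0.0, 5.0, 0.0, 12.0, 3.0]
estimate = [2.0 * p for p in reference]  # perfect timing, doubled totals
print(kge_components(estimate, reference))
print(kge(estimate, reference))
```

In this hypothetical case the estimate has perfect timing and the same relative variability as the reference but twice the total, so only the bias term moves away from 1 and the KGE falls from its optimum of 1 to approximately 0. Inspecting the three components separately, as done in the evaluation, shows which aspect of a dataset limits its score.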