There have been numerous comparisons of statistical and dynamical downscaling models. However, differences in model skill can be distorted by inconsistencies in experimental set-up, inputs and output format. This paper harmonizes such factors when evaluating daily precipitation downscaled over the Iberian Peninsula by the Statistical DownScaling Model (SDSM) and two configurations of the dynamical Weather Research and Forecasting (WRF) model, one with data assimilation (D) and one without (N). The ERA-Interim (ERAI) reanalysis at 0.75° resolution provides common inputs for spinning up and driving the WRF model and for calibrating SDSM. WRF runs and SDSM output were evaluated against ECA&D stations and TRMM, GPCP and EOBS gridded precipitation for 2010-2014 using the same suite of diagnostics. Differences between WRF and SDSM are comparable to observational uncertainty, but the relative skill of the downscaling techniques varies with the diagnostic. The SDSM ensemble mean, WRF-D and ERAI have similar correlation scores (r = 0.45-0.7), but there were large variations amongst SDSM ensemble members (r = 0.3-0.6). The best Linear Error in Probability Space (LEPS = 0.001-0.007) and simulations of precipitation amount were achieved by individual members of the SDSM ensemble. However, the Brier Skill Score shows that these members do not improve on the ERA-Interim prediction, whereas precipitation occurrence is reproduced best by WRF-D. SDSM achieved similar skill whether applied to station or gridded precipitation data. Given the greater computational demands of WRF compared with SDSM, clear statements of expected value-added are needed when applying the former to climate impacts and adaptation research.

Publisher's Note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.