Abstract. Skilful hydrological forecasts at sub-seasonal to seasonal lead times would be extremely beneficial for decision-making in water resources management, hydropower operations, and agriculture, especially during drought conditions. Ensemble Streamflow Prediction (ESP) is a well-established method for generating an ensemble of streamflow forecasts in the absence of skilful future meteorological predictions, instead using Initial Hydrological Conditions (IHCs), such as soil moisture, groundwater, and snow, as the source of skill. We benchmark when and where the ESP method is skilful across a diverse sample of 314 catchments in the UK and explore the relationship between catchment storage and ESP skill. The GR4J hydrological model was forced with historic climate sequences to produce a 51-member ensemble of streamflow hindcasts. We evaluated forecast skill seamlessly for lead times from 1 day to 12 months, with forecasts initialised on the first of each month over a 50-year hindcast period from 1965 to 2015. Results show that ESP was skilful against a climatology benchmark forecast in the majority of catchments across all lead times up to a year ahead, but the degree of skill was strongly conditional on lead time, forecast initialisation month, and individual catchment location and storage properties. UK-wide mean ESP skill decays exponentially as a function of lead time, with mean squared error skill scores across the year of 0.839, 0.303, and 0.179 for 1-day, 1-month, and 3-month lead times, respectively. However, skill was not uniform across all initialisation months. For lead times up to 1 month, ESP skill was higher than average when initialised in summer and lower in winter, whereas for longer seasonal and annual lead times skill was highest when initialised in autumn and winter months and lowest in April.
ESP is most skilful in the south and east of the UK, where slower-responding catchments with higher soil moisture and groundwater storage are mainly located; the correlation between catchment Baseflow Index (BFI) and ESP skill was very strong (0.896 at 1-month lead time). In contrast, ESP is generally not skilful at seasonal lead times in the more highly responsive catchments of the north and west. Overall, this work provides a scientifically defensible justification for when and where use of such a relatively simple forecasting approach is appropriate in the UK and creates a low-cost benchmark against which potential skill improvements from more sophisticated hydro-meteorological ensemble prediction systems can be judged.
Hydrol. Earth Syst. Sci. Discuss., https://doi
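As an illustration of the approach summarised in the abstract, the following minimal Python sketch generates an ESP ensemble and scores a forecast with a mean squared error skill score of the standard form 1 - MSE_forecast / MSE_benchmark. It is a sketch under stated assumptions, not the study's implementation: the single linear reservoir is a toy stand-in for GR4J, the function names (`linear_reservoir`, `esp_ensemble`, `msess`), the recession constant `k`, the choice to leave out the forecast year's own climate sequence, and all numbers are hypothetical and chosen only for illustration.

```python
from statistics import fmean


def linear_reservoir(storage, precip, k=0.1):
    """Toy stand-in for a hydrological model (NOT GR4J).

    Routes a precipitation sequence through a single store and returns
    the simulated flows; `storage` plays the role of the initial
    hydrological condition (IHC).
    """
    flows = []
    for p in precip:
        storage += p
        q = k * storage  # outflow proportional to current storage
        storage -= q
        flows.append(q)
    return flows


def esp_ensemble(initial_storage, historic_precip, exclude_year, k=0.1):
    """Build one ensemble member per historic year, all from the same IHC.

    Assumption: the forecast year's own climate sequence is left out.
    """
    return [
        linear_reservoir(initial_storage, seq, k)
        for year, seq in sorted(historic_precip.items())
        if year != exclude_year
    ]


def msess(forecasts, benchmarks, observations):
    """Mean squared error skill score: 1 - MSE_forecast / MSE_benchmark."""
    mse_f = fmean((f - o) ** 2 for f, o in zip(forecasts, observations))
    mse_b = fmean((b - o) ** 2 for b, o in zip(benchmarks, observations))
    return 1.0 - mse_f / mse_b


# Hypothetical three-step precipitation sequences for three historic years.
historic_precip = {
    2001: [0.0, 0.0, 0.0],
    2002: [10.0, 0.0, 0.0],
    2003: [5.0, 5.0, 5.0],
}

# Forecast for 2003: every member starts from the same initial storage
# but is forced with a different historic climate sequence.
members = esp_ensemble(100.0, historic_precip, exclude_year=2003)
ensemble_mean = [fmean(step) for step in zip(*members)]
```

A score of 1 indicates a perfect forecast, 0 indicates no improvement over the benchmark, and negative values indicate a forecast worse than the benchmark. In this toy setting, a larger initial storage relative to the forcing makes the members diverge more slowly, which mirrors the link between catchment storage and ESP skill described above.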