Correspondence to: J.-F. Exbrayat (jean-francois.exbrayat@umwelt.uni-giessen.de)

Abstract. Model predictions of biogeochemical fluxes at the landscape scale are highly uncertain, with respect to both stochastic (parameter) and structural uncertainty. In this study, five different models (LASCAM, LASCAM-S, a self-developed tool, SWAT and HBV-N-D) designed to simulate hydrological fluxes as well as the mobilisation and transport of one or several nitrogen species were applied to the mesoscale River Fyris catchment in mid-eastern Sweden. Hydrological calibration against 5 years of recorded daily discharge at two stations gave highly variable results, with Nash-Sutcliffe Efficiency (NSE) ranging between 0.48 and 0.83. Using the calibrated hydrological parameter sets, the parameter uncertainty linked to the nitrogen parameters was explored in order to cover the range of possible predictions of exported loads for three nitrogen species: nitrate (NO₃), ammonium (NH₄) and total nitrogen (Tot-N). For each model and each nitrogen species, predictions were ranked in two different ways according to the performance indicated by two goodness-of-fit measures: the coefficient of determination R² and the root mean square error RMSE. A total of 2160 deterministic Single Model Ensembles (SMEs) were generated using an increasing number of members (from the 2 best to the 10 best single predictions). Finally, the best SMEs for each model, nitrogen species and discharge station were selected and merged into 330 different Multi-Model Ensembles (MMEs). The evolution of R² and RMSE was used as a performance descriptor of the ensemble procedure. In each studied case, numerous ensemble merging schemes were identified which outperformed any of their members. Improvement rates were generally higher when worse members were introduced.
The highest improvements were achieved for the nitrogen SMEs compiled by multiple linear regression from R²-selected members, for which the RMSE decreased by up to 90%.
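The ranking and merging steps summarised above can be illustrated with a minimal sketch. It is not the implementation used in the study: the function names, the toy data and the arithmetic-mean merging scheme are assumptions made for illustration only (the abstract names multiple linear regression as one merging scheme but does not give its details). R² is taken here as the squared Pearson correlation, which is also an assumption.

```python
import math


def rmse(obs, sim):
    """Root mean square error between observed and simulated series."""
    return math.sqrt(sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs))


def r_squared(obs, sim):
    """Coefficient of determination, assumed here to be squared Pearson r."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_s = sum(sim) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(obs, sim))
    var_o = sum((o - mean_o) ** 2 for o in obs)
    var_s = sum((s - mean_s) ** 2 for s in sim)
    return cov ** 2 / (var_o * var_s)


def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 minus error variance over observed variance."""
    mean_o = sum(obs) / len(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - mean_o) ** 2 for o in obs)
    return 1.0 - num / den


def rank_members(obs, predictions, by="rmse"):
    """Return prediction names ordered from best to worst for one measure."""
    if by == "rmse":
        return sorted(predictions, key=lambda k: rmse(obs, predictions[k]))
    return sorted(predictions, key=lambda k: r_squared(obs, predictions[k]),
                  reverse=True)


def ensemble_mean(predictions, members):
    """Merge the selected members by an arithmetic mean (assumed scheme)."""
    series = [predictions[m] for m in members]
    return [sum(vals) / len(vals) for vals in zip(*series)]


# Toy data (hypothetical): one observed series and three single predictions.
obs = [1.0, 2.0, 3.0, 4.0]
predictions = {
    "a": [1.5, 2.5, 3.5, 4.5],  # biased high
    "b": [0.4, 1.4, 2.4, 3.4],  # biased low
    "c": [3.0, 1.0, 4.0, 2.0],  # poorly correlated
}

ranked = rank_members(obs, predictions, by="rmse")
sme = ensemble_mean(predictions, ranked[:2])  # ensemble of the 2 best members
best_single = min(rmse(obs, p) for p in predictions.values())
print(ranked, rmse(obs, sme), best_single)
```

In this toy case the two best members carry opposite biases, so their mean has a lower RMSE than either member alone. This is one way an ensemble can outperform all of its members.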