Abstract. Reducing uncertainty and improving robustness and spatio-temporal extrapolation capabilities remain key challenges in hydrological modeling, especially for flood forecasting over large areas. Parsimonious model structures and effective optimization strategies are crucially needed to tackle the difficult issue of calibrating distributed hydrological models from sparse integrative discharge data, which in general leads to high-dimensional inverse problems. This contribution presents the first evaluation over a large sample of Variational Data Assimilation (VDA), which is very well suited to this context but still rarely employed in hydrology because of its high technicality. VDA is successfully applied here to the spatially distributed calibration of a newly tailored grid-based parsimonious model structure and its corresponding adjoint. The approach is based on the VDA framework of the SMASH (Spatially distributed Modelling and ASsimilation for Hydrology) platform, which underlies the French national flash flood forecasting system Vigicrues Flash. It proposes an upgraded distributed hourly rainfall-runoff model structure employing GR-based operators, including a non-conservative flux, and its adjoint obtained by automatic differentiation for VDA. The performance of the approach is assessed over annual, seasonal and flood timescales via standard performance metrics and in spatio-temporal validation. The gain of using the proposed non-conservative 6-parameter model structure is highlighted in terms of performance and robustness, compared to a simpler 3-parameter structure. Spatially distributed calibration yields a significant gain, reaching high performance in calibration and temporal validation over the catchment sample, with median efficiencies of NSE = 0.88 (resp. 0.85) and NSE = 0.8 (resp. 0.79), respectively, over the total time window on period p2 (resp. p1). Simulated signatures in temporal validation over 1443 (resp. 1522) flood events on period p2 (resp. p1) are quite good, with median flood (NSE; KGE) of (0.63; 0.59) (resp. (0.55; 0.53)). Spatio-temporal validations, i.e. on pseudo-ungauged cases, also lead to encouraging performance. Moreover, the influence of certain catchment characteristics on model performance and parametric sensitivity is analyzed. The best performance is obtained for Oceanic and Mediterranean basins, whereas the model performs less well over Uniform basins with a significant influence of multi-frequency hydrogeological processes. Interestingly, regional sensitivity analysis reveals that the non-conservative water exchange parameter and the production parameter, both of which impact the simulated runoff amount, are the most sensitive parameters, along with the routing parameter, especially for faster-responding catchments. This study is a first step in the construction of a flexible and versatile multi-model and optimization framework with hybrid methods for regional hydrological modeling with multi-source data assimilation.
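
For readers unfamiliar with the two scores quoted above, the following minimal sketch shows how the Nash-Sutcliffe efficiency (NSE) and the Kling-Gupta efficiency (KGE) are commonly computed from observed and simulated discharge series. It is an illustration only: the abstract does not state which KGE variant the study uses, so the commonly used original form (correlation, variability ratio, bias ratio) is assumed here. The function names `nse` and `kge` and the sample values are hypothetical and not taken from SMASH.

```python
import math


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((s - o) ** 2 for o, s in zip(obs, sim))
    obs_variance_sum = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - squared_error / obs_variance_sum


def kge(obs, sim):
    """Kling-Gupta efficiency (assumed original form): 1 is a perfect fit."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    std_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    std_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)) / n
    r = cov / (std_obs * std_sim)   # linear correlation
    alpha = std_sim / std_obs       # variability ratio
    beta = mean_sim / mean_obs      # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical hourly discharge values (m3/s), for illustration only.
obs = [1.0, 3.0, 7.0, 4.0, 2.0]
sim = [1.2, 2.5, 6.0, 4.5, 2.1]
print(f"NSE = {nse(obs, sim):.3f}, KGE = {kge(obs, sim):.3f}")
```

Both scores equal 1 for a perfect simulation. NSE drops to 0 when the simulation is no better than the observed mean, which is why median values such as those reported above are read against these two reference points.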