Abstract. We present a workflow for efficient construction and calibration of large-scale groundwater models that includes the integration of airborne electromagnetic (AEM) data and hydrological data. In the first step, the AEM data are inverted to form a 3-D geophysical model. In the second step, the 3-D geophysical model is translated, using a spatially dependent petrophysical relationship, to form a 3-D hydraulic conductivity distribution. The geophysical models and the hydrological data are used to estimate spatially distributed petrophysical shape factors. The shape factors primarily work as translators between resistivity and hydraulic conductivity, but they can also compensate for structural defects in the geophysical model.The method is demonstrated for a synthetic case study with sharp transitions among various types of deposits. Besides demonstrating the methodology, we demonstrate the importance of using geophysical regularization constraints that conform well to the depositional environment. This is done by inverting the AEM data using either smoothness (smooth) constraints or minimum gradient support (sharp) constraints, where the use of sharp constraints conforms best to the environment. The dependency on AEM data quality is also tested by inverting the geophysical model using data corrupted with four different levels of background noise. Subsequently, the geophysical models are used to construct competing groundwater models for which the shape factors are calibrated. The performance of each groundwater model is tested with respect to four types of prediction that are beyond the calibration base: a pumping well's recharge area and groundwater age, respectively, are predicted by applying the same stress as for the hydrologic model calibration; and head and stream discharge are predicted for a different stress situation.As expected, in this case the predictive capability of a groundwater model is better when it is based on a sharp geophysical model instead of a smoothness constraint. This is true for predictions of recharge area, head change, and stream discharge, while we find no improvement for prediction of groundwater age. Furthermore, we show that the model prediction accuracy improves with AEM data quality for predictions of recharge area, head change, and stream discharge, while there appears to be no accuracy improvement for the prediction of groundwater age.