Conceptual uncertainty is considered one of the major sources of uncertainty in groundwater flow modelling. In this regard, hypothesis testing is essential to increase system understanding by refuting alternative conceptual models. A stepwise approach with respect to complexity is often promoted, but hypothesis testing of simple groundwater models is rarely applied. We present an approach to model-based Bayesian hypothesis testing in a simple groundwater balance model, in which the model is optimized as a function of both parameter values and conceptual model through trans-dimensional sampling. We apply the methodology to the Wildman River area, Northern Territory, Australia, where we set up 32 different conceptual models. A factorial approach to conceptual model development allows differences in performance to be attributed directly to individual uncertain components of the conceptual model. The method provides a screening tool for prioritizing research efforts while also giving more confidence in the predicted water balance than a deterministic water balance solution. We show that the testing of alternative conceptual models can be done efficiently with a simple additive and linear groundwater balance model and is best done relatively early in the groundwater modelling workflow.

One branch of hypothesis testing is based on Bayesian probability theory. In Bayesian hypothesis testing, a prior belief about the suitability of a conceptual model is updated to a posterior belief by evaluating the model performance against data. The performance of alternative models is then compared in order to quantitatively rank and potentially reject hypotheses based on the so-called Bayes factor [9,10]. Fields of application in hydrogeology include groundwater modelling [11,12], hydrogeophysics [13,14] and solute transport modelling [15,16].
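The ranking of hypotheses by Bayes factor described above can be illustrated with a minimal sketch. It assumes that a log marginal likelihood (log evidence) is already available for each conceptual model. The model names, the numerical values and the function names are hypothetical and are not taken from the study.

```python
import math


def posterior_model_probabilities(log_evidence, prior=None):
    """Update prior model probabilities with the evidence of each model.

    log_evidence: dict mapping model name -> log marginal likelihood.
    prior: dict mapping model name -> prior probability (uniform if None).
    """
    names = list(log_evidence)
    if prior is None:
        prior = {m: 1.0 / len(names) for m in names}
    # Unnormalized log posterior: log prior + log evidence.
    log_post = {m: math.log(prior[m]) + log_evidence[m] for m in names}
    # Subtract the maximum before exponentiating to avoid underflow.
    shift = max(log_post.values())
    total = sum(math.exp(v - shift) for v in log_post.values())
    return {m: math.exp(v - shift) / total for m, v in log_post.items()}


def bayes_factor(log_evidence, model_a, model_b):
    """Ratio of marginal likelihoods of model_a to model_b."""
    return math.exp(log_evidence[model_a] - log_evidence[model_b])


# Hypothetical log evidences for three alternative conceptual models.
log_evidence = {"M1": -120.0, "M2": -122.0, "M3": -130.0}

posterior = posterior_model_probabilities(log_evidence)
bf_12 = bayes_factor(log_evidence, "M1", "M2")

for name, p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"{name}: posterior probability {p:.4f}")
print(f"Bayes factor M1 vs M2: {bf_12:.2f}")
```

With equal prior probabilities, the ratio of two posterior model probabilities equals the Bayes factor between those models, so either quantity can be used for ranking.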
In many applications, the data are not sufficient to discriminate between the models, in which case Bayesian model averaging is often applied, where model predictions are weighted according to their performance against data [17].

In hydrogeology, conceptual models are often tested in mathematical models (e.g., [18,19]). Since a model comprises the description of a system as a whole, all assumptions and the interactions between assumptions are tested at once. Also, data that do not directly relate to the conceptually uncertain feature can be integrated because of the holistic testing of the system.

A stepwise approach with regard to the complexity of groundwater flow modelling and hypothesis testing is often promoted [20,21]. In this paper, the simplicity of a model is defined in terms of setup and run-time. In a stepwise approach, complexity is built up gradually, and the models are tested at each step to better understand the relative importance of the various assumptions. This is opposed to starting with a complex model where all known processes and structural aspects are incorporated "because they exist, not because they matter" [20].

Although there seems to be a consensu...