The goal of this work is to improve the inference of nonaqueous-phase contaminated source zone architectures (CSA) from field data. We follow the idea that a physically motivated model for CSA formation helps in this inference by providing relevant relationships between observables and the unknown CSA. Typical multiphase models are computationally too expensive to be applied in inverse modeling; thus, state-of-the-art CSA identification techniques do not yet use physically based CSA formation models. To overcome this shortcoming, we apply a stochastic multiphase model with reduced computational effort that can be used to generate a large ensemble of possible CSA realizations. Further, we apply a reverse transport formulation to accelerate the inversion of transport-related data such as downgradient aqueous-phase concentrations. We combine these approaches within a Bayesian inverse methodology for joint inversion of CSA and aquifer parameters. Because we use multiphase physics to constrain and inform the inversion, (1) only physically meaningful CSAs are inferred; (2) each conditional realization is statistically meaningful; (3) we obtain physically meaningful spatial dependencies, between the different unknowns and observables involved, for interpolating and extrapolating point-like observations; (4) these dependencies go far beyond simple correlation; and (5) the inversion yields meaningful uncertainty bounds. We illustrate our concept by inferring three-dimensional probability distributions of dense nonaqueous-phase liquid (DNAPL) residence, of contaminant mass discharge, and of other CSA characteristics. In the inference example, we use synthetic numerical data on permeability, DNAPL saturation, and downgradient aqueous-phase concentration, and we substantiate our claims about the advantages of emulating a multiphase flow model with reduced computational requirements in the inversion.
Key Points:
Physically based stochastic multiphase model supports CSA inference and proper handling of scales
Statistical consistency is maintained in a stochastic Bayesian framework
Valuable detailed information about the unknown contaminant source architecture is revealed

dimensional inverse problem. It is high-dimensional because the involved unknowns are the entire three-dimensional distribution of DNAPL saturation within the source zone, and finely resolved three-dimensional fields of aquifer parameters need to be included as indispensable covariates [Koch and Nowak, 2015]. The data available for inversion may comprise a variety of data types such as soil properties, hydraulic heads, groundwater velocities, locations of known pure-phase contaminant residence, and dissolved-phase concentration values. It is a nonlinear problem, because the relationships between observables and the unknowns are nonlinear. The nonlinearity is especially pronounced since multiphase flow is a strongly nonlinear and, in this specific case, even an unstable process [Illangasekare et al., 1995; Glass, 2003]. Nonuniqueness means that an infinite set of realizations, each of...