Many machine/deep learning artificial neural networks are trained simply to be interpolation functions that map input variables to output values interpolated from the training data in a linear/nonlinear fashion. Even when the input/output pairs of the training data are physically accurate (e.g. the results of an experiment or numerical simulation), interpolated quantities can deviate quite far from being physically accurate. Although one could project the output of a network into a physically feasible region, such a postprocess is not captured by the energy function minimized when training the network; thus, the final projected result could incorrectly deviate quite far from the training data. We propose folding any such projection or postprocess directly into the network so that the final result is correctly compared to the training data by the energy function. Although we propose a general approach, we illustrate its efficacy on a specific convolutional neural network that takes in human pose parameters (joint rotations) and outputs a prediction of vertex positions representing a triangulated cloth mesh. While the original network outputs vertex positions with erroneously high stretching and compression energies, the new network trained with our physics prior remedies these issues, producing significantly improved results.

there will be large errors in f̂(w, x_T) when compared to y_T. On the other hand, although one could create a network with many degrees of freedom in order to capture y_T = f̂(w, x_T) as accurately as desired, even exactly, f̂(w, x) could oscillate wildly and inaccurately when x is not equal to x_T, i.e. overfitting. See, e.g., [14, 15, 16, 17]. One needs to take great care when designing the network architecture in order to avoid underfitting while still allowing for enough regularization to also avoid overfitting. Likewise, the form of the energy function and the nature of the numerical optimization techniques also need careful consideration.
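The core idea of folding a projection into the training energy can be illustrated with a deliberately tiny, hypothetical sketch (not the paper's cloth network): a scalar model f(w, x) = w·x whose output must lie in a feasible set [0, 1], enforced by the projection P(v) = clip(v, 0, 1). Rather than training on f and projecting afterwards, the energy compares P(f(w, x_T)) to y_T directly, so the optimizer accounts for the postprocess. The model, projection, and gradient scheme here are all illustrative assumptions.

```python
import numpy as np

# Hypothetical 1D illustration: model f(w, x) = w * x with feasible set
# [0, 1] enforced by the projection P(v) = clip(v, 0, 1).

def f(w, x):
    return w * x

def project(v):
    return np.clip(v, 0.0, 1.0)

def energy(w, x, y):
    # The projection is folded into the energy, so training compares the
    # *projected* output to the data, as the text proposes.
    return (project(f(w, x)) - y) ** 2

# A single training pair (x_T, y_T) with y_T inside the feasible set.
x_T, y_T = 2.0, 0.5

# Gradient descent with a central finite-difference gradient (keeps the
# sketch short). Note: clip has zero gradient outside [0, 1], so we start
# with the output inside the feasible region.
w, h, eps = 0.3, 0.1, 1e-6
for _ in range(500):
    g = (energy(w + eps, x_T, y_T) - energy(w - eps, x_T, y_T)) / (2 * eps)
    w -= h * g

print(project(f(w, x_T)))  # the projected prediction approaches y_T = 0.5
```

The zero-gradient region of the clip is exactly the kind of subtlety that makes folding a postprocess into the energy nontrivial in practice; a naive "train, then project" pipeline never sees it at all.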
Some of the most popular methods include variants of BFGS [18, 19, 20] and a number of methods based on gradient descent [21, 22] (see also [23] and the references therein), or interpreting gradient descent as a numerical approximation to an ordinary differential equation to be solved via various approaches motivated by order of accuracy [24, 25] and adaptive time-stepping [26, 27, 28, 29, 30].

Devising a network architecture with enough representative capability to alleviate underfitting while still being amenable to the regularization required to avoid overfitting, and subsequently applying numerical optimization techniques to an adequately designed energy in order to find reasonable parameters w, is quite a difficult and mostly experimental endeavour. Thus, much of the progress made by the community emanates from the laborious creation of data sets that many researchers can consider in order to design network architectures and find suitable parameters w; see e.g. [31]. This is typically driven by a community (rather than an individual or group) effort, and state-of-the-art r...
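The ODE interpretation of gradient descent mentioned above can be sketched concretely: plain gradient descent on an energy E(w) is exactly forward-Euler time-stepping of dw/dt = -∇E(w) with step size h. The quadratic energy below is a hypothetical choice, picked because the exact ODE solution w(t) = w(0)·e^{-t} is available for comparison.

```python
import numpy as np

# Gradient descent viewed as forward-Euler integration of the gradient
# flow dw/dt = -grad E(w): each update w <- w - h * grad E(w) is one
# Euler step of size h.

def grad_E(w):
    return w  # gradient of the illustrative energy E(w) = 0.5 * ||w||^2

w0 = np.array([1.0, -2.0])
w = w0.copy()
h, steps = 0.01, 100  # integrates the flow up to t = h * steps = 1.0

for _ in range(steps):
    w = w - h * grad_E(w)  # forward Euler = plain gradient descent

exact = w0 * np.exp(-1.0)  # exact gradient-flow solution at t = 1
print(w, exact)  # the Euler iterate approximates w(1) = w(0) * e^{-1}
```

Higher-order integrators and adaptive step-size control, applied to this same flow, correspond to the order-of-accuracy and adaptive time-stepping approaches cited above.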