Introducing inequality constraints in Gaussian process (GP) models can lead to more realistic uncertainty quantification in a wide variety of real-world learning problems. We consider the finite-dimensional Gaussian approach of Maatouk and Bay (2017), which can satisfy inequality conditions everywhere (boundedness, monotonicity or convexity). Our contributions are threefold. First, we extend their approach to deal with general sets of linear inequalities. Second, we explore several Markov Chain Monte Carlo (MCMC) techniques to approximate the posterior distribution. Third, we investigate theoretical and numerical properties of the constrained likelihood for covariance parameter estimation. In experiments on both artificial and real data, our full framework, together with a Hamiltonian Monte Carlo-based sampler, performs efficiently in both data fitting and uncertainty quantification.
Adding inequality constraints (e.g. boundedness, monotonicity, convexity) into Gaussian processes (GPs) can lead to more realistic stochastic emulators. Due to the truncated Gaussianity of the posterior, its distribution has to be approximated. In this work, we consider Monte Carlo (MC) and Markov Chain Monte Carlo (MCMC) methods. However, strictly interpolating the observations may entail expensive computations due to highly restrictive sample spaces. Furthermore, having (constrained) GP emulators when data are actually noisy is also of interest for real-world implementations. Hence, we introduce a noise term for the relaxation of the interpolation conditions, and we develop the corresponding approximation of GP emulators under linear inequality constraints. We show with various toy examples that the performance of MC and MCMC samplers improves when considering noisy observations. Finally, on 2D and 5D coastal flooding applications, we show that more flexible and realistic GP implementations can be obtained by considering noise effects and by enforcing the (linear) inequality constraints.
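As a minimal illustration of the Monte Carlo approach with noise relaxation described above (not the paper's actual implementation), the sketch below draws from a noisy GP posterior and keeps, by rejection, only the trajectories satisfying a boundedness constraint. The squared-exponential kernel, the bounds and all parameter values are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(x1, x2, variance=1.0, lengthscale=0.2):
    """Squared-exponential covariance between two sets of 1-D inputs."""
    d = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def constrained_posterior_samples(x_train, y_train, x_test, noise_var=1e-2,
                                  lower=0.0, upper=1.0, n_draws=5000, seed=0):
    """Monte Carlo rejection sampling: draw from the noisy GP posterior
    (the noise term relaxes strict interpolation) and keep only the
    trajectories that stay inside [lower, upper] at the test inputs."""
    rng = np.random.default_rng(seed)
    # Noisy training covariance: the nugget noise_var relaxes interpolation.
    K = rbf_kernel(x_train, x_train) + noise_var * np.eye(len(x_train))
    Ks = rbf_kernel(x_test, x_train)
    Kss = rbf_kernel(x_test, x_test)
    # Standard GP posterior mean and covariance at the test inputs.
    mean = Ks @ np.linalg.solve(K, y_train)
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    draws = rng.multivariate_normal(
        mean, cov + 1e-10 * np.eye(len(x_test)), size=n_draws)
    # Accept only draws satisfying the boundedness constraint everywhere.
    keep = np.all((draws >= lower) & (draws <= upper), axis=1)
    return draws[keep]
```

With highly restrictive constraints or strict interpolation (noise_var close to zero), the acceptance rate of such a rejection sampler collapses, which is precisely the motivation given above for both the noise relaxation and the more elaborate MCMC samplers.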
We consider covariance parameter estimation for a Gaussian process under inequality constraints (boundedness, monotonicity or convexity) in fixed-domain asymptotics. We address the estimation of the variance parameter and of the microergodic parameter of the Matérn and Wendland covariance functions. First, we show that the (unconstrained) maximum likelihood estimator has the same asymptotic distribution, unconditionally and conditionally on the Gaussian process satisfying the inequality constraints. Then, we study the recently suggested constrained maximum likelihood estimator and show that it has the same asymptotic distribution as the (unconstrained) maximum likelihood estimator. In addition, we show in simulations that the constrained maximum likelihood estimator is generally more accurate on finite samples. Finally, we provide extensions to prediction and to noisy observations. Obtaining results on maximum likelihood estimation of microergodic parameters that hold for very general classes of covariance functions remains challenging. Nevertheless, significant contributions have been made for specific types of covariance functions. In particular, for the isotropic Matérn family of covariance functions and input space dimension d = 1, 2, 3, a reparameterized quantity obtained from the variance and correlation length parameters is microergodic (Zhang, 2004). It has been shown in (Kaufman and Shaby, 2013), building on previous results in (Du et al., 2009) and (Wang and Loh, 2011), that the maximum likelihood estimator of this microergodic parameter is consistent and asymptotically Gaussian distributed. Earlier results on the exponential covariance function were obtained in (Ying, 1991, 1993). In this paper, we consider the situation where the trajectories of the Gaussian process are known to satisfy either boundedness, monotonicity or convexity constraints.
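To make the variance-parameter estimation above concrete, here is a small sketch of the closed-form unconstrained MLE of the variance when the correlation matrix is held fixed (a Matérn 3/2 correlation is used here as an assumed example). The constrained estimator studied in the paper additionally conditions on the inequality constraints and admits no such closed form.

```python
import numpy as np

def matern32_corr(x, lengthscale=0.3):
    """Matérn 3/2 correlation matrix for 1-D inputs x."""
    d = np.abs(x[:, None] - x[None, :]) / lengthscale
    return (1.0 + np.sqrt(3.0) * d) * np.exp(-np.sqrt(3.0) * d)

def variance_mle(y, R):
    """Closed-form (unconstrained) MLE of the variance sigma^2 when the
    correlation matrix R is fixed: maximising the Gaussian log-likelihood
    of y ~ N(0, sigma^2 R) gives sigma^2_hat = y^T R^{-1} y / n."""
    n = len(y)
    return float(y @ np.linalg.solve(R, y)) / n
```

Under fixed-domain asymptotics the variance parameter alone is in general not consistently estimable, which is why the abstract focuses on the microergodic reparameterization; the sketch only illustrates the likelihood computation entering both the MLE and the cMLE.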
Indeed, Gaussian processes with inequality constraints provide suitable regression models in application fields such as computer networking (monotonicity) (Golchi et al., 2015), social system analysis (monotonicity) (Riihimäki and Vehtari, 2010) and econometrics (monotonicity or positivity) (Cousin et al., 2016). Furthermore, it has been shown that taking the constraints into account may considerably improve the predictions and the predictive intervals of the Gaussian process (Da Veiga and Marrel, 2012; Golchi et al., 2015; Riihimäki and Vehtari, 2010). Recently, a constrained maximum likelihood estimator (cMLE) for the covariance parameters has been suggested. Contrary to the (unconstrained) maximum likelihood estimator (MLE) discussed above, the cMLE explicitly takes into account the additional information brought by the inequality constraints. It has been shown, essentially, that the consistency of the MLE implies the consistency of the cMLE under boundedness, monotonicity or convexity constraints. The aim of this paper is to study the asymptotic conditional distributions of the MLE and the cMLE.
Given recent scientific advances, coastal flooding events can be properly modelled. Nevertheless, such models are computationally expensive (requiring many hours), which prevents their use for forecasting and warning. In addition, there is a gap between the model outputs and the information actually needed by decision makers. The present work aims to develop and test a method capable of forecasting coastal flood information adapted to users' needs. The method must be robust and fast and must integrate the complexity of coastal flood processes. The explored solution relies on metamodels, i.e., mathematical functions that precisely and efficiently (within minutes) estimate the results that the numerical model would provide. While the principle of relying on metamodel solutions is not new, the originality of the present work is to tackle and validate the entire process, from the identification of user needs to the establishment and validation of the rapid forecast and early warning system (FEWS), while relying on numerical modelling, metamodelling, the development of indicators, and information technologies. The development and validation are performed at the study site of Gâvres (France). This site is subject to wave overtopping, so the numerical phase-resolving SWASH model is used to build the learning dataset required for the metamodel setup. Gaussian process- and random forest classifier-based metamodels are used and post-processed to estimate 14 indicators of interest for FEWS users. These metamodelling and post-processing schemes are implemented in an FEWS prototype, which is employed by local users and exhibits good warning skill during the validation period. Based on this experience, we provide recommendations for the improvement and/or application of this methodology and its individual steps to other sites.
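As a toy illustration of the metamodelling step described above (not the actual SWASH-based pipeline or its 14 indicators), the sketch below pairs a GP regressor for a continuous indicator with a random forest classifier for a binary one, using scikit-learn and hypothetical indicator names.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def fit_metamodels(X, water_level, flooded, seed=0):
    """Fit two metamodels on simulator runs X: a GP regressor for a
    continuous indicator (here a hypothetical 'water_level') and a random
    forest classifier for a binary indicator ('flooded' / not flooded).
    Indicator names and model settings are illustrative assumptions."""
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True,
                                  random_state=seed)
    rf = RandomForestClassifier(n_estimators=200, random_state=seed)
    gp.fit(X, water_level)
    rf.fit(X, flooded)
    return gp, rf
```

Once fitted on the expensive simulator runs, both metamodels predict new scenarios in milliseconds, which is what makes the forecast and early warning use case feasible.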