Computer simulation often is used to study complex physical and engineering processes. Although a computer simulator often can be viewed as an inexpensive way to gain insight into a system, it still can be computationally costly. Much of the recent work on the design and analysis of computer experiments has focused on scenarios where the goal is to fit a response surface or process optimization. In this article we develop a sequential methodology for estimating a contour from a complex computer code. The approach uses a stochastic process model as a surrogate for the computer simulator. The surrogate model and associated uncertainty are key components in a new criterion used to identify the computer trials aimed specifically at improving the contour estimate. The proposed approach is applied to exploration of a contour for a network queuing system. Issues related to practical implementation of the proposed approach also are addressed.
For many expensive deterministic computer simulators, the outputs do not have replication error and the desired metamodel (or statistical emulator) is an interpolator of the observed data. Realizations of Gaussian spatial processes (GP) are commonly used to model such simulator outputs. Fitting a GP model to n data points requires the computation of the inverse and determinant of n × n correlation matrices, R, that are sometimes computationally unstable due to near-singularity of R. This happens if any pair of design points are very close together in the input space. The popular approach to overcome nearsingularity is to introduce a small nugget (or jitter) parameter in the model that is estimated along with other model parameters. The inclusion of a nugget in the model often causes unnecessary over-smoothing of the data. In this article, we propose a lower bound on the nugget that minimizes the over-smoothing and an iterative regularization approach to construct a predictor that further improves the interpolation accuracy. We also show that the proposed predictor converges to the GP interpolator.
Constrained blackbox optimization is a difficult problem, with most approaches coming from the mathematical programming literature. The statistical literature is sparse, especially in addressing problems with nontrivial constraints. This situation is unfortunate because statistical methods have many attractive properties: global scope, handling noisy objectives, sensitivity analysis, and so forth. To narrow that gap, we propose a combination of response surface modeling, expected improvement, and the augmented Lagrangian numerical optimization framework. This hybrid approach allows the statistical model to think globally and the augmented Lagrangian to act locally. We focus on problems where the constraints are the primary bottleneck, requiring expensive simulation to evaluate and substantial modeling effort to map out. In that context, our hybridization presents a simple yet effective solution that allows existing objective-oriented statistical approaches, like those based on Gaussian process surrogates and expected improvement heuristics, to be applied to the constrained setting with minor modification. This work is motivated by a challenging, real-data benchmark problem from hydrology where, even with a simple linear objective function, learning a nontrivial valid region complicates the search for a global minimum.
Gaussian process (GP) models are commonly used statistical metamodels for emulating expensive computer simulators. Fitting a GP model can be numerically unstable if any pair of design points in the input space are close together. Ranjan, Haynes, and Karsten (2011) proposed a computationally stable approach for fitting GP models to deterministic computer simulators. They used a genetic algorithm based approach that is robust but computationally intensive for maximizing the likelihood. This paper implements a slightly modified version of the model proposed by Ranjan et al. (2011) in the R package GPfit. A novel parameterization of the spatial correlation function and a clustering based multistart gradient based optimization algorithm yield robust optimization that is typically faster than the genetic algorithm based approach. We present two examples with R codes to illustrate the usage of the main functions in GPfit. Several test functions are used for performance comparison with the popular R package mlegp. We also use GPfit for a real application, i.e., for emulating the tidal kinetic energy model for the Bay
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.