Definition

Gaussian processes (GPs) are local approximation techniques that model spatial data by placing (and updating) priors on the covariance structures underlying the data. Originally developed for geospatial contexts, they are also applicable in general contexts that involve computing and modeling with multi-level spatial aggregates, e.g., modeling a configuration space for crystallographic design, casting folding energies as a function of a protein's contact map, and formulating vaccination policies that take into account the social dynamics of individuals. Typically, we assume a parametrized covariance structure underlying the data to be modeled, estimate the covariance parameters conditional on the locations at which data have been observed, and use the inferred structure to make predictions at new locations. GPs have a probabilistic basis that allows us to estimate variances at unsampled locations, aiding in the design of targeted sampling strategies.
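As a concrete illustration of this workflow, the sketch below conditions a GP on a few observed 2D locations and returns predictive means and variances at unsampled locations. It is a minimal sketch assuming a squared-exponential covariance with fixed (rather than estimated) hyperparameters; the function names and parameter values are illustrative and not part of the original text.

```python
# Minimal GP prediction sketch (NumPy), assuming a squared-exponential
# covariance with fixed hyperparameters; `length_scale`, `signal_var`,
# and `noise_var` are illustrative choices, not estimated from data.
import numpy as np

def sq_exp_cov(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between two sets of 2D locations."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return signal_var * np.exp(-0.5 * d2 / length_scale ** 2)

def gp_predict(X_obs, t_obs, X_new, noise_var=1e-2):
    """Posterior mean and variance at new locations, conditional on observations."""
    K = sq_exp_cov(X_obs, X_obs) + noise_var * np.eye(len(X_obs))
    K_star = sq_exp_cov(X_new, X_obs)
    K_starstar = sq_exp_cov(X_new, X_new)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, t_obs))  # K^{-1} t
    mean = K_star @ alpha
    v = np.linalg.solve(L, K_star.T)
    var = np.diag(K_starstar) - (v ** 2).sum(axis=0)
    return mean, var

# Observed responses at sampled 2D locations, predictions at unsampled ones.
X_obs = np.array([[0.0, 0.0], [1.0, 0.5], [2.0, 1.5]])
t_obs = np.array([0.3, 1.1, 0.2])
X_new = np.array([[0.5, 0.25], [3.0, 3.0]])
mean, var = gp_predict(X_obs, t_obs, X_new)
# Predictive variances grow far from the data, which is the property
# exploited when designing targeted sampling strategies.
```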
Historical Background

The underlying ideas behind GPs can be traced back to the geostatistics technique called kriging (Journel and Huijbregts 1992), named after the South African mining engineer Danie Krige. Kriging in this literature was used to model response variables (e.g., ozone concentrations) over 2D spatial fields as realizations of a stochastic process. Sacks et al. (1989) described the use of kriging to model (deterministic) computer experiments. It took more than a decade from this point for the larger computer science community to investigate GPs for pattern analysis purposes. Thus, in the recent past, GPs have witnessed a revival, primarily due to the work of MacKay (1997) and the graphical models literature (Jordan 1998). Neal established the connection between Gaussian processes and neural networks with an infinite number of hidden units (Neal 1996). Such relationships allow us to take traditional learning techniques and re-express them as imposing a particular covariance structure on the joint distribution of inputs. For instance, we can take a trained neural network and mine the covariance structure implied by its weights (given mild assumptions, such as a Gaussian prior over the weight space). Williams motivates the usefulness of such studies and describes common covariance functions (Williams 1998). Williams and Barber (1998) describe how the Gaussian process framework can be extended to classification, in which the modeled variable is categorical. Since these publications, interest in GPs has grown rapidly, with a steady stream of papers at conferences such as ICML and NIPS; see also the recently published book by Rasmussen and Williams (2006).
Scientific Fundamentals

A GP can be formally defined as a collection of random variables, any finite subset of which have a (multivariate) normal distribution. For simplicity, we assume 2D spatially distributed (scalar) response variables $t_i$, one for each location $\mathbf{x}_i = [x_{i1}, x_{i2}]$ at which we have collected a data sample. Observe that, in the limiting case, each random variable has a Gaussian distribution (but i...
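To make the definition concrete, the short sketch below builds the covariance matrix for a finite set of 2D locations and draws one sample from the implied multivariate normal distribution. The squared-exponential covariance function and its hyperparameters are assumptions made for illustration only.

```python
# Illustration of the definition: for any finite set of 2D locations, the
# responses have a joint multivariate normal distribution whose covariance
# is given by a kernel (here an assumed squared-exponential form).
import numpy as np

rng = np.random.default_rng(0)

# A small 2D grid of locations x_i = [x_i1, x_i2].
g = np.linspace(0.0, 2.0, 5)
X = np.array([[a, b] for a in g for b in g])      # 25 locations

# Covariance matrix K with K[i, j] = cov(t_i, t_j).
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * d2)                             # unit length scale and variance

# One draw from the GP prior: a single sample of the 25-dimensional Gaussian.
t = rng.multivariate_normal(mean=np.zeros(len(X)), cov=K)
```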