Abstract: There is an increasing need to consistently combine observations from different sensors to monitor the state of the land surface. To achieve this, robust methods based on the inversion of radiative transfer (RT) models can be used to interpret the satellite observations. Interpreting the observations in this way is an inverse problem, and a major drawback of these methods is their computational complexity. We introduce the concept of Gaussian Process (GP) emulators: surrogate functions that accurately approximate RT models using a small set of input (e.g., leaf area index, leaf chlorophyll, etc.) and output (e.g., top-of-canopy reflectances or at-sensor radiances) pairs. The emulators quantify the uncertainty of their approximation, and provide a fast and easy route to estimating the Jacobian of the original model, enabling the use of, e.g., efficient gradient descent methods. We demonstrate the emulation of widely used RT models (PROSAIL and SEMIDISCRETE) and the coupling of vegetation and atmospheric (6S) RT models targeting particular sensor bands. A comparison with the full original model outputs shows that the emulators are a viable option to replace the original model, with negligible bias and discrepancies that are much smaller than the typical uncertainty in the observations. We also extend the theory of GPs to cope with models with multivariate outputs (e.g., over the full solar reflective domain), and apply this to the emulation of PROSAIL, of coupled 6S and PROSAIL, and of individual spectral components of 6S. In all cases, the emulators accurately predict both the full model output and the gradient of the model calculated by finite differences, and run between 10,000 and 50,000 times faster than the original model.
Finally, we use emulators to retrieve leaf area index (LAI), leaf chlorophyll content (Cab) and equivalent leaf water thickness (Cw) from a time series of Sentinel-2/MSI, Sentinel-3/SLSTR and Proba-V observations. We use Hamiltonian Markov Chain Monte Carlo (MCMC) methods that exploit both the speed of the emulators and their gradient estimates, a variational data assimilation (DA) method that extends the problem with temporal regularisation, and a particle filter using a regularisation model. The variational and particle filter approaches appear more successful (retrieved parameters closer to the truth, with smaller uncertainties) than the MCMC approach as a result of using the temporal regularisation model. This work therefore suggests that GP emulators are a practical way to implement sophisticated parameter retrieval schemes in an era of increasing data volumes.
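The emulator idea summarised above can be illustrated with a minimal sketch. It is not the implementation used in this work: the one-parameter `toy_model` is a hypothetical stand-in for an expensive RT model, the squared-exponential kernel has a fixed, hand-picked length scale (an assumption; hyperparameters would normally be estimated from the training pairs), and the class and function names are invented for illustration. The sketch trains a GP on a handful of input/output pairs, returns a predictive mean and variance, and evaluates the analytic gradient of the mean, which is then compared with a finite-difference estimate.

```python
import math


def toy_model(lai):
    """Hypothetical stand-in for an expensive RT model: saturating response to LAI."""
    return 1.0 - math.exp(-0.5 * lai)


def rbf(a, b, length_scale):
    """Squared-exponential covariance with unit prior variance."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


class Emulator:
    """Zero-mean GP emulator of a scalar model with a single input parameter."""

    def __init__(self, x_train, y_train, length_scale=1.0, jitter=1e-8):
        self.x = list(x_train)
        self.ls = length_scale  # assumed fixed, not optimised
        # Training covariance matrix, with a small jitter on the diagonal.
        self.K = [[rbf(a, b, self.ls) + (jitter if i == j else 0.0)
                   for j, b in enumerate(self.x)]
                  for i, a in enumerate(self.x)]
        self.alpha = solve(self.K, list(y_train))  # K^{-1} y

    def predict(self, x_new):
        """Return the predictive mean and variance at x_new."""
        k = [rbf(x_new, xi, self.ls) for xi in self.x]
        mean = sum(a * ki for a, ki in zip(self.alpha, k))
        v = solve(self.K, k)
        var = max(0.0, 1.0 - sum(ki * vi for ki, vi in zip(k, v)))
        return mean, var

    def gradient(self, x_new):
        """Analytic derivative of the predictive mean with respect to the input."""
        return sum(a * rbf(x_new, xi, self.ls) * (xi - x_new) / self.ls ** 2
                   for a, xi in zip(self.alpha, self.x))


# Train on eight runs of the "expensive" model, evenly spaced in LAI over [0, 6].
x_train = [6.0 * i / 7 for i in range(8)]
emu = Emulator(x_train, [toy_model(x) for x in x_train])

x0, h = 2.5, 1e-5
mean, var = emu.predict(x0)
finite_diff = (emu.predict(x0 + h)[0] - emu.predict(x0 - h)[0]) / (2 * h)
print("model:", toy_model(x0), "emulator:", mean, "variance:", var)
print("analytic gradient:", emu.gradient(x0), "finite difference:", finite_diff)
```

Because the gradient comes from differentiating the kernel, it costs about as much as a single emulator prediction, which is what makes gradient-based inversion schemes cheap once the emulator has been trained.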