Variational and ensemble methods have been developed separately by various research and development groups and each brings its own benefits to data assimilation. In the last decade or so, various ways have been developed to combine these methods, especially with the aims of improving the background-error covariance matrices and of improving efficiency. The field has become confusing, even to many specialists, and so there is now a need to summarize the methods in order to show how they work, how they are related, what benefits they bring, why they have been developed, how they perform, and what improvements are pending. This article starts with a reminder of basic variational and ensemble techniques and shows how they can be combined to give the emerging ensemble-variational (EnVar) and hybrid methods. A key part of the article includes details of how localization is commonly represented. There has been a particular push to develop four-dimensional methods that are free of linearized forecast models. This article attempts to provide derivations of the formulations of the most popular schemes, which are otherwise scattered throughout the literature or absent. It builds on the nomenclature used to distinguish between methods, and discusses further possible developments to the methods, including the representation of model error.

Key Words: variational data assimilation; ensemble data assimilation; hybrid data assimilation; flow-dependent background-error covariances; localization; model error; nomenclature
Data assimilation and uncertainty

Dealing with uncertainty is at the heart of data assimilation (DA). Forecast models (here those used in numerical weather prediction, NWP) use initial conditions that are imperfect, and are based upon imperfect representations of physical processes. It is well known that a free-running model will accumulate errors until its forecast is no longer useful (Tribbia and Baumhefner, 2004). The only way to restore usefulness is to allow the model to be influenced by observations (Leith, 1993). DA (Daley, 1991; Kalnay, 2002; Rabier, 2005) is the procedure of maintaining the link between evolving models and reality by updating the model with fresh observations. Researchers are working towards formal and robust mathematical methods that work in ways that are consistent with the model, the data, and their degrees of uncertainty. DA is related to approaches used in other fields, which go by different names, e.g. state estimation (Wunsch, 2012); optimization (Biegler, 1997); history matching (Emerick, 2012); retrieval production (Rodgers, 2000); inverse modelling (Tarantola, 2005), and there are additionally many ways of solving a DA problem. Most known DA methods are based on probabilistic theories (most, if not all, exploit Bayes' theorem; Lorenc, 1986) and each is made practical by making approximations.

Traditionally there are three Bayesian-based strategies that allow the DA problem to be solved in approximate (hence suboptimal) ways. These may be categorized as the following. (i) Variation...
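Since Bayes' theorem underlies all of these strategies, it is worth recalling its form for state estimation. The following is a minimal LaTeX sketch in generic notation; the symbols used here (\(\mathbf{x}\) for the state, \(\mathbf{y}\) for the observations, \(\mathbf{x}^{\mathrm{b}}\) for the background, \(\mathcal{H}\) for the observation operator, and \(\mathbf{B}\), \(\mathbf{R}\) for the background- and observation-error covariance matrices) follow standard DA conventions and are not necessarily this article's final notation.

% Bayes' theorem for the state-estimation problem: the posterior
% probability density of the state x given observations y is the
% likelihood of y times the prior (background) density, normalized
% by the evidence p(y), which does not depend on x.
\begin{equation}
  p(\mathbf{x} \mid \mathbf{y})
    = \frac{p(\mathbf{y} \mid \mathbf{x})\, p(\mathbf{x})}{p(\mathbf{y})}
    \propto p(\mathbf{y} \mid \mathbf{x})\, p(\mathbf{x}).
\end{equation}
% Under Gaussian assumptions for the background and observation
% errors, maximizing this posterior is equivalent to minimizing the
% familiar quadratic cost function of variational DA:
\begin{equation}
  J(\mathbf{x}) =
    \tfrac{1}{2}\,(\mathbf{x} - \mathbf{x}^{\mathrm{b}})^{\mathrm{T}}
      \mathbf{B}^{-1} (\mathbf{x} - \mathbf{x}^{\mathrm{b}})
    + \tfrac{1}{2}\,\bigl(\mathbf{y} - \mathcal{H}(\mathbf{x})\bigr)^{\mathrm{T}}
      \mathbf{R}^{-1} \bigl(\mathbf{y} - \mathcal{H}(\mathbf{x})\bigr).
\end{equation}

Each strategy below can then be read as a different approximate way of characterizing this posterior; variational methods, for instance, seek its mode under the Gaussian assumption.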