Recent progress in machine learning (ML) inspires the idea of improving (or learning) earth system models directly from observations. Earth sciences already use data assimilation (DA), which underpins decades of progress in weather forecasting. DA and ML have many similarities: they are both inverse methods that can be united under a Bayesian (probabilistic) framework. ML could benefit from approaches used in DA, which has evolved to deal with real observations that are uncertain, sparsely sampled, and only indirectly sensitive to the processes of interest. DA could also become more like ML and start learning improved models of the earth system, either through parameter estimation or by directly incorporating machine-learnable models. DA follows the Bayesian approach more exactly in terms of representing uncertainty, and in retaining existing physical knowledge, which helps to better constrain the learnt aspects of models. This article draws equivalences between DA and ML in the unifying framework of Bayesian networks. These networks help illustrate, for example, the equivalence between four-dimensional variational (4D-Var) DA and a recurrent neural network (RNN). More broadly, Bayesian networks are graphical representations of the knowledge and processes embodied in earth system models, giving a framework for organising modelling components and knowledge, whether coming from physical equations or learnt from observations. Their full Bayesian solution is not computationally feasible, but these networks can be solved with approximate methods already used in DA and ML, so they could provide a practical framework for the unification of the two. Development of all these approaches could address the grand challenge of making better use of observations to improve physical models of earth system processes.
This article is part of the theme issue ‘Machine learning for weather and climate modelling’.