Abstract. In this paper, the performance of three machine-learning methods for predicting short-term
evolution and for reproducing the long-term statistics of a multiscale spatiotemporal Lorenz 96
system is examined. The methods are an echo state network (ESN, which is a type of reservoir computing; hereafter RC–ESN), a deep feed-forward artificial neural network (ANN), and a recurrent neural network (RNN) with long
short-term memory (LSTM; hereafter RNN–LSTM). This Lorenz 96 system has three tiers of nonlinearly interacting
variables representing slow/large-scale (X), intermediate (Y), and fast/small-scale (Z)
processes. For training or testing, only X is available; Y and Z are never known or used. We
show that RC–ESN substantially outperforms ANN and RNN–LSTM for short-term predictions, e.g.,
accurately forecasting the chaotic trajectories for hundreds of numerical-solver time steps,
equivalent to several Lyapunov timescales. RNN–LSTM outperforms ANN, and both methods also show
some prediction skill. Furthermore, even after losing the trajectory, data predicted by
RC–ESN and RNN–LSTM have probability density functions (pdfs) that closely match the true pdf, even at the tails. The pdf of the data predicted using ANN, however, deviates from the true
pdf. Implications, caveats, and applications to data-driven and data-assisted surrogate modeling
of complex nonlinear dynamical systems, such as weather and climate, are discussed.
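
To make the reservoir-computing idea concrete, the following is a minimal illustrative sketch of an echo state network trained to predict the next state of a Lorenz 96 system and then run freely on its own output. It is not the implementation evaluated in this paper. For brevity it assumes the standard single-tier Lorenz 96 equations rather than the three-tier X, Y, Z system, and every function name and hyperparameter (reservoir size, spectral radius, ridge parameter, time step, number of variables) is a hypothetical choice made only for illustration.

```python
import numpy as np

def lorenz96_rhs(x, forcing=8.0):
    # Single-tier Lorenz 96 tendency with cyclic indices (assumed simplification):
    # dX_k/dt = (X_{k+1} - X_{k-2}) X_{k-1} - X_k + F
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing

def rk4_step(x, dt):
    # One classical fourth-order Runge-Kutta step.
    k1 = lorenz96_rhs(x)
    k2 = lorenz96_rhs(x + 0.5 * dt * k1)
    k3 = lorenz96_rhs(x + 0.5 * dt * k2)
    k4 = lorenz96_rhs(x + dt * k3)
    return x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

def make_series(n_steps, n_vars=8, dt=0.01, spinup=1000, seed=0):
    # Integrate from a slightly perturbed state and discard a spin-up period.
    rng = np.random.default_rng(seed)
    x = 8.0 + 0.01 * rng.standard_normal(n_vars)
    out = np.empty((n_steps, n_vars))
    for t in range(spinup + n_steps):
        x = rk4_step(x, dt)
        if t >= spinup:
            out[t - spinup] = x
    return out

def train_esn(series, n_res=300, spectral_radius=0.1, ridge=1e-4, seed=1):
    # Fixed random input and reservoir weights; only the readout is trained.
    rng = np.random.default_rng(seed)
    n_vars = series.shape[1]
    w_in = rng.uniform(-0.5, 0.5, (n_res, n_vars))
    w = rng.uniform(-1.0, 1.0, (n_res, n_res))
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    states = np.zeros((len(series), n_res))
    r = np.zeros(n_res)
    for t, u in enumerate(series):
        r = np.tanh(w @ r + w_in @ u)
        states[t] = r
    # Ridge regression mapping the reservoir state at step t to the data at step t+1.
    R, Y = states[:-1], series[1:]
    w_out = np.linalg.solve(R.T @ R + ridge * np.eye(n_res), R.T @ Y).T
    return w_in, w, w_out, r

def forecast(w_in, w, w_out, r, n_steps):
    # Free-running prediction: each output is fed back as the next input.
    preds = np.empty((n_steps, w_out.shape[0]))
    for t in range(n_steps):
        u = w_out @ r
        preds[t] = u
        r = np.tanh(w @ r + w_in @ u)
    return preds

data = make_series(2000)
mean, std = data.mean(axis=0), data.std(axis=0)
z = (data - mean) / std                      # standardize each variable
w_in, w, w_out, r = train_esn(z[:1500])      # train on the first 1500 steps
preds = forecast(w_in, w, w_out, r, 100) * std + mean  # forecast the next 100 steps
```

In this sketch `preds` can be compared against `data[1500:1600]` to gauge short-term skill, and a histogram of a much longer free-running forecast can be compared against that of the true data to examine the long-term statistics. No particular accuracy is implied for these illustrative settings.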