The consolidation of sequential experience is thought to enable efficient schema-based reconstruction of the past and prediction of the future, but the underlying mechanism is unknown. Here, we present a computational model in which sequences are rapidly encoded in the hippocampus and replayed to train a neocortical deep generative network to predict the next item in each sequence. This is simulated using generative pre-trained transformers (GPTs), a type of large language model. As well as capturing the gist of specific episodes, the neocortical network extracts statistical patterns that generalise to new situations. The model explains human performance on statistical learning and structural inference tasks, and accounts for gist- or schema-based distortions in memories of narratives. It also shows how recent memory can contribute to inference and planning, capturing hippocampal and neocortical interactions as 'retrieval-augmented generation', in which specific memories retrieved from the hippocampus provide the context in working memory for prediction using the 'general knowledge' of the neocortical network. Furthermore, it shows how hippocampal traces could combine gist and detail for efficient encoding. The model suggests how episodic, semantic and working memory interact in the consolidation, (re)construction and planning of sequential experience.
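
Although the model itself uses a trained transformer, the division of labour it proposes can be illustrated with a minimal pure-Python sketch: one-shot hippocampal encoding, replay-driven consolidation into a slow statistical learner, and retrieval-augmented generation in which a retrieved episode supplies the working-memory context for prediction. This is not the authors' implementation: a simple bigram counter stands in for the GPT, and all class and function names (Hippocampus, Neocortex, replay, retrieve, generate) are illustrative assumptions.

```python
import random
from collections import defaultdict, Counter

class Hippocampus:
    """Fast store: one-shot encoding of specific sequences (episodes)."""
    def __init__(self):
        self.episodes = []

    def encode(self, sequence):
        # Rapid episodic encoding: stored verbatim after a single exposure.
        self.episodes.append(list(sequence))

    def replay(self, n):
        # Offline replay: sample stored episodes as training data.
        return random.choices(self.episodes, k=n)

    def retrieve(self, cue):
        # Return the stored episode with the greatest overlap with the cue.
        return max(self.episodes, key=lambda ep: len(set(ep) & set(cue)))

class Neocortex:
    """Slow statistical learner: a bigram next-item predictor standing in
    for the deep generative (GPT-like) network of the model."""
    def __init__(self):
        self.counts = defaultdict(Counter)

    def train_step(self, sequence):
        # Gradual extraction of transition statistics across replayed episodes.
        for prev, nxt in zip(sequence, sequence[1:]):
            self.counts[prev][nxt] += 1

    def predict_next(self, item):
        return self.counts[item].most_common(1)[0][0] if self.counts[item] else None

    def generate(self, context, steps=3):
        # Continue a sequence using the learned 'general knowledge'.
        seq = list(context)
        for _ in range(steps):
            nxt = self.predict_next(seq[-1])
            if nxt is None:
                break
            seq.append(nxt)
        return seq

# Consolidation: hippocampal replay trains the neocortical predictor.
hpc, ncx = Hippocampus(), Neocortex()
hpc.encode(["wake", "coffee", "train", "office", "meeting"])
hpc.encode(["wake", "coffee", "car", "office", "email"])
for episode in hpc.replay(n=100):
    ncx.train_step(episode)

# Retrieval-augmented generation: a specific memory retrieved from the
# hippocampus provides the working-memory context, which the neocortical
# network then extends using its learned statistics.
cue = ["coffee", "car"]
context = hpc.retrieve(cue)[:3]          # partial episodic detail
print("retrieved context:", context)
print("completed sequence:", ncx.generate(context))
```

In this toy setting, a partial cue retrieves the matching episode from the hippocampal store, and the neocortical predictor completes it, combining episodic detail with consolidated statistical knowledge in the way the abstract describes.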