Understanding the visual world is a constructive process. Whilst a frontal-hippocampal circuit is known to be essential for this task, little is known about the associated neuronal computations. Visual understanding appears superficially distinct from other known functions of this circuit, such as spatial reasoning and model-based planning, but recent models suggest deeper computational similarities. Here, using fMRI, we show that representations of a simple visual scene in these brain regions are relational and compositional, two key computational properties theorised to support rapid construction of hippocampal maps. Using MEG, we show that rapid sequences of representations, akin to replay in spatial navigation and planning problems, are also engaged in visual construction. Whilst these sequences have previously been proposed as mechanisms to plan possible futures or learn from the past, here they are used to understand the present. Replay sequences form constructive hypotheses about possible scene configurations. These hypotheses play out in an optimal order for relational inference, progressing from predictable to uncertain scene elements, gradually constraining possible configurations, and converging on the correct scene configuration. Together, these results suggest a computational bridge between apparently distinct functions of hippocampal-prefrontal circuitry, and a role for generative replay in constructive inference and hypothesis testing.