Hippocampal episodic memory is fundamentally relational, consisting of links between events and the spatial and temporal contexts in which they occurred. Such relations are also important over much shorter time periods, during online visual perception. For example, how do we assess the relative spatial positions of objects, their temporal order, or the relationship between their features? Here, we investigate the role of the hippocampus in such online relational processing by manipulating visual attention to different kinds of relations in a dynamic display. While undergoing high-resolution fMRI, participants viewed two images in rapid succession on each trial and performed one of three relational tasks, judging the images' relative spatial positions, temporal onsets, or sizes. As a control, they sometimes also judged whether one image was tilted, irrespective of the other; this served as a baseline item task with no demands on relational processing. All hippocampal regions of interest (CA1, CA2/3/DG, subiculum) showed reliable deactivation when participants attended to relational vs. item information. Attention to temporal relations was associated with more robust deactivation than the other conditions. One possible interpretation of such deactivation is that it reflects hippocampal disengagement. If so, there should be reduced information content and noisier, less reliable patterns of activity in the hippocampus for the temporal vs. other tasks. Instead, analyses of multivariate activity patterns revealed more stable hippocampal representations in the temporal task. Additional analyses showed that this increased pattern similarity was not simply a reflection of the lower univariate activity. Thus, the hippocampus differentiates between relational and item processing even during online visual perception, and its representations of temporal relations in particular are robust and stable.
Together, these findings suggest that the relational computations of the hippocampus, known to be important for memory, extend beyond this purpose, enabling the rapid online extraction of relational information in visual perception.
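
To make the notion of pattern stability concrete, the following is a minimal sketch of one common way to quantify it: the mean pairwise Pearson correlation between trial-wise voxel patterns within a region of interest. It is an illustration under stated assumptions, not the analysis pipeline used in this study. The function name `pattern_stability`, the array shapes, and the synthetic data are all hypothetical; no real fMRI data or results are represented.

```python
import numpy as np


def pattern_stability(patterns):
    """Mean pairwise pattern similarity across trials.

    patterns: array of shape (n_trials, n_voxels), one activity pattern
    per trial (hypothetical layout, assumed for this sketch).
    Returns the average Fisher z-transformed Pearson correlation over
    all distinct trial pairs.
    """
    patterns = np.asarray(patterns, dtype=float)
    corr = np.corrcoef(patterns)                 # trials x trials matrix
    upper = np.triu_indices_from(corr, k=1)      # distinct pairs only
    r = np.clip(corr[upper], -0.999999, 0.999999)  # keep arctanh finite
    return float(np.mean(np.arctanh(r)))


# Synthetic illustration only: a shared voxel template plus trial noise.
rng = np.random.default_rng(0)
n_trials, n_voxels = 20, 100
template = rng.standard_normal(n_voxels)

# "Stable" condition: low trial-to-trial noise around the template.
stable = template + 0.5 * rng.standard_normal((n_trials, n_voxels))
# "Noisy" condition: high trial-to-trial noise around the same template.
noisy = template + 2.0 * rng.standard_normal((n_trials, n_voxels))

print("stable:", pattern_stability(stable))
print("noisy: ", pattern_stability(noisy))
```

Because Pearson correlation removes each pattern's mean, subtracting a constant from every voxel (a uniform drop in activity) leaves this measure unchanged. This is one reason a similarity measure of this kind can, in principle, dissociate from overall univariate activity, although it does not by itself rule out every such dependence.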