Along the ventral stream, cortical representations of brief, static stimuli become gradually more invariant to identity-preserving transformations. In the presence of long, ethologically relevant dynamic stimuli, higher invariance should imply temporally persistent representations at the top of this functional hierarchy. However, such stimuli could engage adaptive and predictive processes, whose impact on neural coding dynamics is unknown. More generally, coding dynamics in the presence of temporally structured stimuli are not understood. By probing the rodent analogue of the ventral stream with movies, we uncovered a hierarchy of temporal scales along this pathway, with deeper areas encoding visual information more persistently. Furthermore, the impact of intrinsic dynamics on the stability of stimulus representations gradually grows along the hierarchy. These results suggest that feedforward computations in the cortical hierarchy build up invariance even for dynamic, temporally structured stimuli, and that intrinsic processing contributes to the stabilization of representations in noisy, changing environments.