Natural scenes contain a multitude of cues that can support spatial perception, which makes the contribution of any single cue difficult to study. Here, in a series of pre-registered behavioral studies, we quantify scene-specific spatial representations that generalize over tasks, stimulus durations, and participants. We presented 156 scene images at varying durations (125, 250, or 1000 ms) to independent groups of participants, who either estimated or discriminated the egocentric distance to target objects. Not only were participants able to estimate distance in images seen only once, they also showed scene-specific deviations that strongly predicted behavior in the other task, which was performed by different observers. Given that the only commonality was the scenes themselves, pictorial features must be driving the observed responses. Indeed, we found that one such feature, the size of the ground plane, explained the magnitude of the observed scene-specific deviations. Our results implicate a finely tuned, rapid mechanism for integrating pictorial information into percepts of distance in natural images.