Over the last decades, cognitive neuroscience has identified a distributed set of brain regions that are critical for attention. Strong anatomical overlap with brain regions critical for oculomotor processes suggests a joint network for attention and eye movements. However, the role of this shared network in complex, naturalistic environments remains understudied. Here, we investigated eye movements in relation to (un)attended sentences of natural speech. Combining simultaneously recorded eye tracking and magnetoencephalographic data with temporal response functions, we show that gaze tracks attended speech, a phenomenon we termed ocular speech tracking. Ocular speech tracking even differentiates a target from a distractor in a multi-speaker context and is further related to intelligibility. Moreover, we provide evidence for its contribution to neural differences in speech processing, emphasizing the necessity to consider oculomotor activity in future research and in the interpretation of neural differences in auditory cognition.