During grounded language comprehension, listeners must link the incoming linguistic signal to the visual world despite uncertainty in the input. Information gathered through visual fixations can facilitate understanding. But do listeners flexibly seek supportive visual information? Here, we propose that even young children can adapt their gaze and actively gather information for the goal of language comprehension. We present 2 studies of eye movements during real-time language processing, where the value of fixating on a social partner varies across different contexts. First, compared with children learning spoken English (n = 80), young American Sign Language (ASL) learners (n = 30) delayed gaze shifts away from a language source and produced a higher proportion of language-consistent eye movements. This result provides evidence that ASL learners adapt their gaze to effectively divide attention between language and referents, which both compete for processing via the visual channel. Second, English-speaking preschoolers (n = 39) and adults (n = 31) fixated longer on a speaker’s face while processing language in a noisy auditory environment. Critically, like the ASL learners in Experiment 1, this delay resulted in gathering more visual information and a higher proportion of language-consistent gaze shifts. Taken together, these studies suggest that young listeners can adapt their gaze to seek visual information from social partners to support real-time language comprehension.