Listening to speech in noise can require substantial mental effort, even among younger normal-hearing adults. The task-evoked pupil response (TEPR) has been shown to track the increased effort exerted to recognize words or sentences in increasing noise. However, few studies have examined the trajectory of listening effort across longer, more natural stretches of speech, or the extent to which expectations about upcoming listening difficulty modulate the TEPR. Thirteen younger normal-hearing adults listened to three repetitions of 60-s-long audiobook passages at two different signal-to-noise ratios (SNRs) while pupil size was recorded. There was a significant interaction between SNR, repetition, and baseline pupil size that affected sustained listening effort. At lower baseline pupil sizes, reflecting lower attention mobilization, TEPRs were more sustained in the harder SNR condition, particularly when attention mobilization remained low by the third presentation. At intermediate baseline pupil sizes, differences between conditions were largely absent, suggesting these listeners had optimally mobilized their attention for both SNRs. Lastly, at higher baseline pupil sizes, reflecting over-mobilization of attention, the effect of SNR was reversed in the first 30 s of story listening: participants initially appeared overwhelmed by the harder SNR condition, resulting in reduced TEPRs that recovered in the second half of the story. Listeners who had still under-mobilized their attention by the third repetition exhibited rapidly decreasing TEPRs at both SNRs. Together, these findings suggest that the way listening effort unfolds over time depends critically on the extent to which individuals successfully mobilize their attention in anticipation of difficult listening conditions.