April 29, 2020

Recent research has demonstrated that pupillometry is a robust measure for quantifying listening effort. However, pupillary responses remain hard to interpret in listening situations where multiple cognitive functions are engaged and sustained over a period of time. This limits our conceptualisation and understanding of listening effort in realistic situations, because people in everyday life are rarely challenged by one task at a time. The purpose of this experiment was therefore to reveal the dynamics of listening effort in a sustained listening condition using a word repeat and recall task.

Words were presented in quiet and in speech-shaped noise at different signal-to-noise ratios (SNRs). Participants were presented with lists of 10 words and were required to repeat each word after its presentation. At the end of each list, participants either recalled as many words as possible or moved on to the next list. Their pupil dilation was recorded throughout the whole experiment.

When only word repetition was required, peak pupil dilation (PPD) was larger in the 0 dB condition than in the other conditions. When recall was required, PPD showed no difference among SNR levels, and PPD in the 0 dB condition was smaller than in the repeat-only condition. Baseline pupil diameter and PPD followed different growth patterns across the 10 serial positions in conditions requiring recall: baseline pupil diameter built up progressively and plateaued in the later positions (but rose sharply at the onset of recall, i.e. the end of the list), whereas PPD decreased at a quicker pace than in the repeat-only condition.

The current findings concur with the recent literature in showing that additional cognitive load during a speech intelligibility task can disturb the well-established relation between pupillary response and listening effort.
Both the magnitude and the temporal pattern of the task-evoked pupillary response differ greatly in complex listening conditions, which calls for more studies of listening effort in complex and realistic listening situations.

Introduction

Effortless as it seems, everyday communication is cognitively demanding. Degraded speech input induced by adverse listening conditions (e.g., background noise, reverberation, etc.) and peripheral hearing loss introduces a mismatch between perceived acoustic signals and their canonical forms [1-3]. Resolving this mismatch demands more resources from the finite pool of cognitive resources, leaving fewer resources for other cognitive tasks and eventually leading to overload [4,5]. Populations facing long-term auditory challenges are particularly at risk. For instance, people with hearing impairment, and particularly those using cochlear implants (CI), often experience high and sustained effort, even when speech recognition performance is similar [6-10]. CI listeners have to engage and deploy more cognitive resources to achieve a satisfactory level of speech communication due to electric hearing. Such elevated and sustained listening effort is associated with detrimental psychosocial consequence...