Acoustic context influences speech perception, but contextual variability restricts this influence. Assgari and Stilp [J. Acoust. Soc. Am. 138, 3023–3032 (2015)] demonstrated that when categorizing vowels, variability in who spoke the preceding context sentence on each trial but not the sentence contents diminished the resulting spectral contrast effects (perceptual shifts in categorization stemming from spectral differences between sounds). Yet, how such contextual variability affects temporal contrast effects (TCEs) (also known as speaking rate normalization; categorization shifts stemming from temporal differences) is unknown. Here, stimuli were the same context sentences and conditions (one talker saying one sentence, one talker saying 200 sentences, 200 talkers saying 200 sentences) used in Assgari and Stilp [J. Acoust. Soc. Am. 138, 3023–3032 (2015)], but set to fast or slow speaking rates to encourage perception of target words as “tier” or “deer,” respectively. In Experiment 1, sentence variability and talker variability each diminished TCE magnitudes; talker variability also produced shallower psychometric function slopes. In Experiment 2, when speaking rates were matched across the 200-sentences conditions, neither TCE magnitudes nor slopes differed across conditions. In Experiment 3, matching slow and fast rates across all conditions failed to produce equal TCEs and slopes everywhere. Results suggest a complex interplay between acoustic, talker, and sentence variability in shaping TCEs in speech perception.