Effects of phonetic and indexical variability on talker normalization

Drown, Lee; Theodore, Rachel M.

doi:10.1121/1.5146955

Cited by 3 publications

(1 citation statement)

References 0 publications

“…On the other hand, variability-related processing costs have been observed for other forms of acoustic variability, including processing costs related to variation in speaking rate (Sommers & Barcroft, 2006; Sommers et al., 1994), speaking style (Sommers & Barcroft, 2006), and within-talker token variability (Drown & Theodore, 2020; Kapadia et al., 2023; Uchanski & Braida, 1998), though not for phonetically irrelevant variation, such as variation in stimulus amplitude (Sommers & Barcroft, 2006). Additionally, there is some evidence that nonhuman animals are also able to perform some form of “talker” normalization, though it remains to be seen whether the same mechanisms underlie this process in humans and nonhuman animals (Kriengwatana et al., 2014).…”

Section: Toward An Integrated Account Of Auditory Attention and Talke... (mentioning)

confidence: 99%

Why are listeners hindered by talker variability?

Luthra

2023

Psychon Bull Rev

Though listeners readily recognize speech from a variety of talkers, accommodating talker variability comes at a cost: Myriad studies have shown that listeners are slower to recognize a spoken word when there is talker variability compared with when talker is held constant. This review focuses on two possible theoretical mechanisms for the emergence of these processing penalties. One view is that multitalker processing costs arise through a resource-demanding talker accommodation process, wherein listeners compare sensory representations against hypothesized perceptual candidates and error signals are used to adjust the acoustic-to-phonetic mapping (an active control process known as contextual tuning). An alternative proposal is that these processing costs arise because talker changes involve salient stimulus-level discontinuities that disrupt auditory attention. Some recent data suggest that multitalker processing costs may be driven by both mechanisms operating over different time scales. Fully evaluating this claim requires a foundational understanding of both talker accommodation and auditory streaming; this article provides a primer on each literature and also reviews several studies that have observed multitalker processing costs. The review closes by underscoring a need for comprehensive theories of speech perception that better integrate auditory attention and by highlighting important considerations for future research in this area.

Context effects in perception of vowels differentiated by F1 are not influenced by variability in talkers' mean F1 or F3

Mills

Shorey

Theodore

et al. 2022

The Journal of the Acoustical Society of America

Spectral properties of earlier sounds (context) influence recognition of later sounds (target). Acoustic variability in context stimuli can disrupt this process. When mean fundamental frequencies (f0’s) of preceding context sentences were highly variable across trials, shifts in target vowel categorization [due to spectral contrast effects (SCEs)] were smaller than when sentence mean f0’s were less variable; when sentences were rearranged to exhibit high or low variability in mean first formant frequencies (F1) in a given block, SCE magnitudes were equivalent [Assgari, Theodore, and Stilp (2019) J. Acoust. Soc. Am. 145(3), 1443–1454]. However, since sentences were originally chosen based on variability in mean f0, stimuli underrepresented the extent to which mean F1 could vary. Here, target vowels (/ɪ/-/ɛ/) were categorized following context sentences that varied substantially in mean F1 (experiment 1) or mean F3 (experiment 2) with variability in mean f0 held constant. In experiment 1, SCE magnitudes were equivalent whether context sentences had high or low variability in mean F1; the same pattern was observed in experiment 2 for new sentences with high or low variability in mean F3. Variability in some acoustic properties (mean f0) can be more perceptually consequential than others (mean F1, mean F3), but these results may be task-dependent.

The effects of variability on context effects and psychometric function slopes in speaking rate normalization

King

Sharpe

Shorey

et al. 2024

The Journal of the Acoustical Society of America

Acoustic context influences speech perception, but contextual variability restricts this influence. Assgari and Stilp [J. Acoust. Soc. Am. 138, 3023–3032 (2015)] demonstrated that when categorizing vowels, variability in who spoke the preceding context sentence on each trial but not the sentence contents diminished the resulting spectral contrast effects (perceptual shifts in categorization stemming from spectral differences between sounds). Yet, how such contextual variability affects temporal contrast effects (TCEs) (also known as speaking rate normalization; categorization shifts stemming from temporal differences) is unknown. Here, stimuli were the same context sentences and conditions (one talker saying one sentence, one talker saying 200 sentences, 200 talkers saying 200 sentences) used in Assgari and Stilp [J. Acoust. Soc. Am. 138, 3023–3032 (2015)], but set to fast or slow speaking rates to encourage perception of target words as “tier” or “deer,” respectively. In Experiment 1, sentence variability and talker variability each diminished TCE magnitudes; talker variability also produced shallower psychometric function slopes. In Experiment 2, when speaking rates were matched across the 200-sentences conditions, neither TCE magnitudes nor slopes differed across conditions. In Experiment 3, matching slow and fast rates across all conditions failed to produce equal TCEs and slopes everywhere. Results suggest a complex interplay between acoustic, talker, and sentence variability in shaping TCEs in speech perception.
