Spectral contrast effects, the perceptual magnification of spectral differences between sounds, have been widely shown to influence speech categorization. However, whether talker information alters spectral contrast effects was recently debated [Laing, Liu, Lotto, and Holt, Front. Psychol. 3, 1-9 (2012)]. Here, the contributions of reliable spectral properties, between-talker variability, and within-talker variability to spectral contrast effects in vowel categorization were investigated. Listeners heard sentences in three conditions (One Talker/One Sentence, One Talker/200 Sentences, 200 Talkers/200 Sentences) followed by a target vowel (varying from /ɪ/ to /ɛ/ in F1, spoken by a single talker). Low-F1 or high-F1 frequency regions in the sentences were amplified to encourage /ɛ/ or /ɪ/ responses, respectively. When sentences contained large reliable spectral peaks (+20 dB; experiment 1), contrast effect magnitudes were comparable across all conditions. Talker information did not alter contrast effects following large spectral peaks, which were likely attributed to an external source (e.g., the communication channel) rather than to talkers. When sentences contained modest reliable spectral peaks (+5 dB; experiment 2), contrast effects were smaller following 200 Talkers/200 Sentences than following the single-talker conditions. Constant recalibration to new talkers reduced listeners' sensitivity to modest spectral peaks, diminishing contrast effects. These results bridge conflicting reports of whether talker information influences spectral contrast effects in speech categorization.
When spectral properties differ across successive sounds, this difference is perceptually magnified, resulting in spectral contrast effects (SCEs). Recently, Stilp, Anderson, and Winn [(2015) J. Acoust. Soc. Am. 137(6), 3466–3476] revealed that SCEs are graded: more prominent spectral peaks in preceding sounds produced larger SCEs (i.e., category boundary shifts) in categorization of subsequent vowels. Here, a similar relationship between spectral context and SCEs was replicated in categorization of voiced stop consonants. By generalizing this relationship across consonants and vowels, different spectral cues, and different frequency regions, acute and graded sensitivity to spectral context appears to be pervasive in speech perception.
Auditory perception is shaped by the spectral properties of surrounding sounds. For example, when spectral properties differ between earlier (context) and later (target) sounds, this can produce spectral contrast effects (SCEs; i.e., perceptual magnification of the spectral differences between sounds, biasing categorization of the later sound).
Recent sounds can change what speech sounds we hear later. This can occur when the average frequency composition of earlier sounds differs from that of later sounds, biasing how they are perceived. These "spectral contrast effects" are widely observed when sounds' frequency compositions differ substantially. We reveal the lower limit of these effects, as +3 dB amplification of key frequency regions in earlier sounds was enough to bias categorization of the following vowel or consonant sound. Speech categorization being biased by very small spectral differences across sounds suggests that spectral contrast effects occur frequently in everyday speech perception.
All perception takes place in context. Recognition of a given speech sound is influenced by the acoustic properties of surrounding sounds. When the spectral composition of earlier (context) sounds (e.g., more energy at lower first formant [F1] frequencies) differs from that of a later (target) sound (e.g., a vowel with intermediate F1), the auditory system magnifies this difference, biasing target categorization (e.g., towards higher-F1 /ɛ/). Historically, these studies used filters to force context sounds to possess desired spectral compositions. This approach is agnostic to the natural signal statistics of speech (inherent spectral compositions without any additional manipulations). The auditory system is thought to be attuned to such stimulus statistics, but this has gone untested. Here, vowel categorization was measured following unfiltered sentences (already possessing the desired spectral composition) or filtered sentences (matched to the spectral characteristics of unfiltered sentences). Vowel categorization was biased in both cases, with larger biases as the spectral prominences in context sentences increased. This confirms sensitivity to natural signal statistics, extending spectral context effects in speech perception to more naturalistic listening conditions. Importantly, categorization biases were smaller and more variable following unfiltered sentences, raising important questions about how faithfully experiments using filtered contexts model everyday speech perception.