Because different talkers produce their speech sounds differently, listeners benefit from maintaining distinct generative models (sets of beliefs) about the correspondence between acoustic information and phonetic categories for different talkers. A robust literature on phonetic recalibration indicates that when listeners encounter a talker who produces their speech sounds idiosyncratically (e.g., a talker who produces their /s/ sound atypically), they can update their generative model for that talker. Such recalibration has been shown to occur in a relatively talker-specific way. Because listeners in ecological situations often meet several new talkers at once, the present study considered how the process of simultaneously updating two distinct generative models compares to updating one model at a time. Listeners were exposed to two talkers, one who produced /s/ atypically and one who produced /ʃ/ atypically. Critically, these talkers only produced these sounds in contexts where lexical information disambiguated the phoneme's identity (e.g., epi_ode, flouri_ing). When initial exposure to the two talkers was blocked by voice (Experiment 1), listeners recalibrated to these talkers after relatively little exposure to each talker (32 instances per talker, of which 16 contained ambiguous fricatives). However, when the talkers were intermixed during learning (Experiment 2), listeners required more exposure trials before they were able to adapt to the idiosyncratic productions of these talkers (64 instances per talker, of which 32 contained ambiguous fricatives). Results suggest that there is a perceptual cost to simultaneously updating multiple distinct generative models, potentially because listeners must first select which generative model to update.
Over the past few decades, research into the function of the cerebellum has expanded far beyond the motor domain. A growing number of studies are probing the role of specific cerebellar subregions, including Crus I and Crus II, in higher-order cognition, though their involvement in speech and language processing remains underspecified. In the current fMRI study, we show evidence of the cerebellum’s sensitivity to variation in two well-studied psycholinguistic properties of words—lexical frequency and phonological neighborhood density—during passive, continuous listening to a podcast. To determine whether, and how, activity in the cerebellum correlates with these lexical properties, we modeled each word separately using an amplitude-modulated regressor, time-locked to the onset of each word. At the group level, significant effects of both lexical properties landed in expected cerebellar subregions—Crus I and Crus II. The BOLD signal correlated with variation in both lexical properties, consistent with how they are thought to affect the ease of word recognition. Activation patterns at the individual level also showed that the most probable sites for effects of phonological neighborhood density and lexical frequency landed in Crus I and Crus II, though there was significant variability across individuals. These results suggest that the functional loci for these effects, especially frequency effects, can emerge in cerebellar subregions that vary across individuals. Although the exact cerebellar mechanisms used during speech and language processing are not yet evident, these findings highlight the cerebellum’s role in word-level processing during continuous listening.