2021 · DOI: 10.1111/tops.12529
Syllable Inference as a Mechanism for Spoken Language Understanding

Abstract: A classic problem in spoken language comprehension is how listeners perceive speech as being composed of discrete words, given the variable time-course of information in continuous signals. We propose a syllable inference account of spoken word recognition and segmentation, according to which alternative hierarchical models of syllables, words, and phonemes are dynamically posited, which are expected to maximally predict incoming sensory input. Generative models are combined with current estimates of context s…
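
The sketch below is a minimal illustration (not the authors' model) of the general idea that segmentation can be cast as probabilistic inference: candidate parses of an unsegmented input are scored under a simple generative model, and the best-scoring parse is selected. The toy lexicon, the letter-string input, and the unigram scoring are all assumptions made for illustration; the paper's account operates over hierarchically organized syllables, words, and phonemes combined with contextual cues such as speech rate, which this sketch omits.

```python
# Illustrative sketch only: maximum-a-posteriori segmentation of an
# unsegmented string into lexicon words under a toy unigram generative
# model, via dynamic programming over possible word boundaries.
# The lexicon and its probabilities are invented for illustration.

import math

LEXICON = {            # hypothetical word -> prior probability
    "the": 0.30,
    "cat": 0.20,
    "at": 0.15,
    "he": 0.15,
    "sat": 0.20,
}
MAX_LEN = max(len(w) for w in LEXICON)


def map_segmentation(signal: str):
    """Return the highest-probability parse of `signal` into lexicon words."""
    n = len(signal)
    # best[i] = (log-probability of the best parse of signal[:i], backpointer)
    best = [(-math.inf, None)] * (n + 1)
    best[0] = (0.0, None)
    for i in range(1, n + 1):
        for j in range(max(0, i - MAX_LEN), i):
            word = signal[j:i]
            if word in LEXICON and best[j][0] > -math.inf:
                score = best[j][0] + math.log(LEXICON[word])
                if score > best[i][0]:
                    best[i] = (score, j)
    if best[n][0] == -math.inf:
        return None  # no parse covers the whole input
    words, i = [], n
    while i > 0:         # trace back the chosen word boundaries
        j = best[i][1]
        words.append(signal[j:i])
        i = j
    return list(reversed(words))


if __name__ == "__main__":
    print(map_segmentation("thecatsat"))  # -> ['the', 'cat', 'sat']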


Cited by 9 publications (13 citation statements) · References 112 publications

Citation statements (ordered by relevance):
“…Spoken language comprehension therefore relies on listeners going beyond the information given and inferring the presence of linguistic structure based on their knowledge of language. As such, many theories posit that linguistic structures—ranging from syllables to morphemes to “words” to syntactic structures—are constructed via an endogenous inference process [4–19]. On this view, also known as “analysis by synthesis” [20], speech triggers internal generation of memory representations (synthesis), which are compared to the sensory input (analysis).…”
Section: Introduction (mentioning)
confidence: 99%
“…This establishes that theories relying on particular acoustic properties being found in speech signals cannot account for human word recognition. Instead, listeners use multiple cues, on multiple levels of linguistic abstraction, to deduce the locations of word boundaries [18, 37–44], supporting proposals that view word segmentation as probabilistic inference [33, 45–47].…”
Section: Introduction (mentioning)
confidence: 72%
“…Consistent with these findings, later work using eye-tracking methodology has also revealed that listeners can use information from preceding rhythmic patterns to predict upcoming lexical stress (e.g., “jury” versus “giraffe,” Brown et al., 2011, 2015), and studies using the event-related potential (ERP) paradigm show that preceding cues can support prediction of word boundaries and later lexical processing and interpretations of what was heard (Breen et al., 2014). Further, recent research has shown that speech rate can also facilitate prediction of upcoming weak syllables (Baese-Berk et al., 2019; see also Brown et al., 2021), suggesting that preceding prosodic cues can have a pervasive role in predicting upcoming words.…”
Section: Variation, Flexibility, and Cue Weighting (mentioning)
confidence: 99%