2017
DOI: 10.1111/cogs.12521

Linguistic Constraints on Statistical Word Segmentation: The Role of Consonants in Arabic and English

Abstract: Statistical learning is often taken to lie at the heart of many cognitive tasks, including the acquisition of language. One particular task in which probabilistic models have achieved considerable success is the segmentation of speech into words. However, these models have mostly been tested against English data, and as a result little is known about how a statistical learning mechanism copes with input regularities that arise from the structural properties of different languages. This study focuses on statist…

Citations: cited by 8 publications (4 citation statements)
References: 52 publications
“…Computational modeling work has started to investigate word segmentation in various languages (Batchelder, 2002;Blanchard et al, 2010;Caines, Altmann-Richer & Buttery, 2019;Daland, 2009;Fleck, 2008;Fourtassi, Börschinger, Johnson & Dupoux, 2013;Kastner & Adriaans, 2017;Saksida et al, 2017). Providing a thorough overview of their findings is beyond the scope of the present study, but we would like to highlight that most previous work attempts to check how a given algorithm performs cross-linguistically to argue for the validity of the algorithm the authors of those studies proposed, rather than to understand whether language properties affect segmentation in a systematic way (e.g., Batchelder, 2002;Boruta, Peperkamp, Crabbé and Dupoux, 2011;M.…”
Section: Cross-linguistic Performance (mentioning)
confidence: 99%
“…Johnson, 2008;Pearl and Phillips, 2018;Phillips and Pearl, 2014a). Exceptions include studies that try to explain away cross-linguistic differences on the basis of corpus characteristics (e.g., Caines et al, 2019;Fourtassi et al, 2013), and work assessing the effect of prosodic and syntactic structure such as head direction (saliently, Gervain and Erra, 2012;Saksida et al, 2017), or the effects of input representation (Kastner & Adriaans, 2017). However, these factors are orthogonal to the present study (i.e., they are not necessarily confounded with morphological complexity).…”
Section: Cross-linguistic Performance (mentioning)
confidence: 99%
“…For matching results, we used the previously published results produced by twelve ontology matching systems (SANOM [31], AML [13], LogMap [32], XMap [33], KEPLER [34], ALIN [9], DOME [11], Holontology [10], FCAMapX [35], [36], LogMapLt [32], ALOD2Vec [12], and Lily [37]) that [38] to evaluate the returned matches based on nine combinations of evaluation variants with crisp reference alignments: ra1-M1, ra1-M2, ra1-M3, ra2-M1, ra2-M2, ra2-M3, rar2-M1, rar2-M2, and rar2-M3 (ra1 is the original reference alignment; ra2 is an extension of ra1; and rar2 is an updated version of ra2 that deals with violations of conservativity). ra1-M1, ra2-M1, and rar2-M1 are used to evaluate only alignments between classes; ra1-M2, ra2-M2, and rar2-M2 are used to evaluate only alignments between properties; and ra1-M3, ra2-M3, and rar2-M3 are used to evaluate both alignments between classes and properties.…”
Section: A Experimental Setup (mentioning)
confidence: 99%
“…In recent computational work, Kastner and Adriaans (2017) have proposed that if the learner divides the input into consonants and vowels, it should be able to make progress on basic acquisition tasks in Semitic. A number of computer simulations found that ignoring the vowels in the input leads the learner to perform better on the task of segmenting the input stream into separate phonological words in Arabic: presumably, if the input consisted only of root consonants (and the occasional affix or clitic), this situation would lend itself surprisingly well to insertion of word boundaries.…”
Section: On Underlying Representations and Surface Forms (mentioning)
confidence: 99%
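
The idea in the last citation statement, segmenting on consonants only, can be illustrated with a toy sketch. This is not the model used by Kastner and Adriaans (2017); it is a generic transitional-probability heuristic, and the vowel inventory, the function names (`strip_vowels`, `transitional_probs`, `segment`), and the threshold are all assumptions made for illustration.

```python
from collections import Counter

# Assumption: a toy romanized vowel inventory, not a real phonological analysis.
VOWELS = set("aeiou")

def strip_vowels(utterance):
    """Keep only the non-vowel symbols of an unsegmented utterance."""
    return "".join(ch for ch in utterance if ch not in VOWELS)

def transitional_probs(utterances):
    """Estimate forward TP(b | a) = count(ab) / count(a followed by anything)."""
    bigrams = Counter()
    firsts = Counter()
    for u in utterances:
        for a, b in zip(u, u[1:]):
            bigrams[(a, b)] += 1
            firsts[a] += 1
    return {pair: n / firsts[pair[0]] for pair, n in bigrams.items()}

def segment(utterance, tps, threshold):
    """Insert a boundary wherever the TP between adjacent symbols is below threshold."""
    if not utterance:
        return []
    words = [utterance[0]]
    for a, b in zip(utterance, utterance[1:]):
        if tps.get((a, b), 0.0) < threshold:
            words.append(b)   # low TP: start a new word
        else:
            words[-1] += b    # high TP: extend the current word
    return words
```

A learner of this kind would first call `strip_vowels` on each utterance, estimate the probabilities from the consonant-only strings, and then call `segment` with a hand-picked threshold. Whether that helps on real Arabic or English input is an empirical question that this sketch does not answer.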