Research into second language (L2) reading is an exponentially growing field. Yet, it still has a relatively short supply of comparable, ecologically valid data from readers representing a variety of first languages (L1). This article addresses this need by presenting a new data resource called MECO L2 (Multilingual Eye Movements Corpus), a rich behavioral eye-tracking record of text reading in English as an L2 among 543 university student speakers of 12 different L1s. MECO L2 includes a test battery of component skills of reading and allows for a comparison of the participants’ reading performance in their L1 and L2. This data resource enables innovative large-scale cross-sample analyses of predictors of L2 reading fluency and comprehension. We first introduce the design and structure of the MECO L2 resource, along with reliability estimates and basic descriptive analyses. Then, we illustrate the utility of MECO L2 by quantifying the contributions of four sources of variability in L2 reading proficiency proposed in prior literature: reading fluency and comprehension in L1, proficiency in L2 component skills of reading, extralinguistic factors, and the L1 of the readers. Major findings included (a) a fundamental contrast between the determinants of L2 reading fluency versus comprehension accuracy, and (b) high within-participant consistency in the real-time strategy of reading in L1 and L2. We conclude by reviewing the implications of these findings for theories of L2 acquisition and outlining further directions in which the new data resource may support L2 reading research.
According to Word and Paradigm Morphology (Matthews, 1974; Blevins, 2016), the word is the basic cognitive unit over which paradigmatic analogy operates to predict form and meaning of novel forms. Baayen et al. (2018, 2019b) introduced a computational formalization of word and paradigm morphology which makes it possible to model the production and comprehension of complex words without requiring exponents, morphemes, inflectional classes, and separate treatment of regular and irregular morphology. This computational model, Linear Discriminative Learning (LDL), makes use of simple matrix algebra to move from words' forms to meanings (comprehension) and from words' meanings to their forms (production). In Baayen et al. (2018), we showed that LDL makes accurate predictions for Latin verb conjugations. The present study reports results for noun declension in Estonian. Consistent with previous findings, the model's predictions for comprehension and production are highly accurate. Importantly, the model achieves this high accuracy without being informed about stems, exponents, and inflectional classes. The speech errors produced by the model look like errors that native speakers might make. When the model is trained on incomplete paradigms, comprehension accuracy for unseen forms is hardly affected, but production accuracy decreases, reflecting the well-known asymmetry between comprehension and production. Unseen principal parts (i.e., nominative, genitive, and partitive singulars) are particularly difficult to produce, possibly due to their more distinctive forms. Removing principal parts from training, however, does not affect accuracy for other case forms. Model performance does not degrade either when the training data includes the alternative forms that are ubiquitous in Estonian.
These results are consistent with the claim of Blevins (2008) that Estonian number and case inflection is organized in a way that facilitates the deduction of full paradigms from only a small number of forms.
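The matrix algebra mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the six-word lexicon, the letter-trigram cues, the binary lexeme-plus-case semantic vectors, and all variable names (`C`, `S`, `F`, `G`) are assumptions chosen for illustration. The sketch estimates a form-to-meaning mapping and a meaning-to-form mapping with the Moore-Penrose pseudoinverse and scores each by nearest-neighbour lookup; whatever further steps the published model uses to assemble predicted cues into ordered word forms are omitted here.

```python
import numpy as np

# Toy lexicon (illustrative only): word form -> (lexeme, case).
lexicon = {
    "jalg": ("JALG", "nom"),
    "jala": ("JALG", "gen"),
    "jalga": ("JALG", "part"),
    "lukk": ("LUKK", "nom"),
    "luku": ("LUKK", "gen"),
    "lukku": ("LUKK", "part"),
}


def trigrams(word):
    """Letter trigrams of a word padded with '#' boundary markers."""
    padded = f"#{word}#"
    return [padded[i:i + 3] for i in range(len(padded) - 2)]


forms = list(lexicon)
cues = sorted({t for w in forms for t in trigrams(w)})
features = sorted({f for pair in lexicon.values() for f in pair})

# C: words x form cues (1 if the trigram occurs in the word).
C = np.array([[1.0 if c in trigrams(w) else 0.0 for c in cues] for w in forms])
# S: words x semantic dimensions (assumed: one-hot lexeme plus one-hot case).
S = np.array([[1.0 if f in lexicon[w] else 0.0 for f in features] for w in forms])

# Least-squares linear mappings via the pseudoinverse.
F = np.linalg.pinv(C) @ S  # comprehension: form -> meaning
G = np.linalg.pinv(S) @ C  # production: meaning -> form

S_hat = C @ F  # predicted meanings
C_hat = S @ G  # predicted form-cue support


def nearest(pred, gold):
    """Index of the gold row with the highest cosine similarity to each predicted row."""
    p = pred / np.linalg.norm(pred, axis=1, keepdims=True)
    g = gold / np.linalg.norm(gold, axis=1, keepdims=True)
    return (p @ g.T).argmax(axis=1)


target = np.arange(len(forms))
comprehension_acc = float((nearest(S_hat, S) == target).mean())
production_acc = float((nearest(C_hat, C) == target).mean())
print("comprehension accuracy:", comprehension_acc)
print("production accuracy:", production_acc)
```

In this toy setup every word form contains at least one trigram that no other form has, so the form-to-meaning mapping reproduces `S` exactly. The semantic vectors have fewer independent dimensions than there are words, so the meaning-to-form mapping only approximates `C`, which is one simple way to see how production can be harder than comprehension.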