2015
DOI: 10.1075/ijlcr.1.1.04ale

Exploring big educational learner corpora for SLA research

Abstract: We consider the opportunities presented by big educational learner corpora for Second Language Acquisition (SLA). In particular, we focus on the EF Cambridge Open Language Database (EFCAMDAT), an open access database of student writings submitted to Englishtown, the online school of EF Education First. EFCAMDAT stands out for its size (33 million words, 85 thousand learners) and a range of 128 writing tasks covering all CEFR levels with data from learners from varying nationalities. We discuss methodological issue…

Cited by 30 publications (29 citation statements)
References 30 publications
“…The data set itself is also a source of limitations. For example, it is worth looking into the potential effects of other variables, including tasks, teaching materials in Englishtown, and varying progress rates across learners (Alexopoulou et al., ).…”
Section: Discussion, mentioning
confidence: 99%
“…An alternative approach is to use the notion of national language (see Alexopoulou, Geertzen, Korhonen, & Meurers, ).…”
mentioning
confidence: 99%
“…However, prior studies have shown that native language taggers and parsers perform fairly well on the learner data in EFCAMDAT (Geertzen et al., ). For example, Alexopoulou, Geertzen, Korhonen, and Meurers () evaluated the accuracy of extraction of relative clauses, reporting an F score of 83.9%, and found that state‐of‐the‐art NLP tools provide reasonable quality.…”
Section: Methods, mentioning
confidence: 99%
“…Another source of difference might be how different L1 groups use formulaic sequences. In a corpus coming from an EFL teaching environment, it is possible that learners use subordinate clauses lifted from their input or task-prompts (Wray, 2002; Alexopoulou et al., 2015). The within L1 comparison of the development of different types of SCs (Fig.…”
Section: Results, mentioning
confidence: 99%
“…These are statistics of the latest version (Version 2). [9] The analysis of infinitival clauses requires analysis of unexpressed arguments that is beyond our scope; descriptively many infinitival clauses might not be considered full clauses. [10] In this paper we do not consider the case of formulaic uses of subordinate clauses, that is, the possibility that learners use SCs lifted from their task prompts that potentially are unanalyzed and, as such, do not reflect true knowledge of SCs. Alexopoulou et al. (2015) present a method for automatic identification of formulaic uses.…”
mentioning
confidence: 99%