Proceedings of the 12th Workshop on Innovative Use of NLP for Building Educational Applications 2017
DOI: 10.18653/v1/w17-5018

Human and Automated CEFR-based Grading of Short Answers

Abstract: This paper is concerned with the task of automatically assessing the written proficiency level of non-native (L2) learners of English. Drawing on previous research on automated L2 writing assessment following the Common European Framework of Reference for Languages (CEFR), we investigate the possibilities and difficulties of deriving the CEFR level from short answers to open-ended questions, which has not yet been the subject of many studies to date. The object of our study is twofold: to examine the intri…

Cited by 10 publications (10 citation statements), published 2019-2023.
References 16 publications.
“…Tables 1 and 2 give the breakdown for each L1. To test the validity of our models on external data, we used the CEFR ASAG corpus (Tack et al., 2017), a collection of short answers to open-ended questions, written by French L1 learners of English and graded with CEFR levels. It consists of 712 texts written by different learners in response to three questions.…”
Section: Corpora (mentioning)
confidence: 99%
“…Lexical diversity—a type of lexical richness that refers to the range and variety of words used in a given response—is another aspect of task performance that researchers have looked into as a potential indicator of writing proficiency. According to Tack et al. (2017), lexical features—in particular lexical diversity measures—found in short answers of between 30 and 200 words were the most informative predictors (compared with other measures, including syntactic, discursive, and readability features) of English L2 writing proficiency, and they were able to distinguish among A1, A2, B1, B2, and C levels of proficiency on the CEFR. This finding was also confirmed in a study by Crossley et al. (2011) in which lexical density (M) was found to be one of the best predictors of English L2 writing proficiency among a variety of lexical measures.…”
Section: Study Background and Assessment Context (mentioning)
confidence: 99%
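As an illustration of the lexical diversity measures discussed in the excerpt above, the following Python sketch computes two common variants: plain type-token ratio (TTR) and Guiraud's root TTR. The tokenizer and the choice of measures are illustrative assumptions; Tack et al. (2017) and Crossley et al. (2011) use their own, more elaborate feature sets.

    # Minimal sketch of lexical diversity measures; the measures chosen
    # here (TTR, root TTR) are illustrative, not the cited papers' exact set.
    import math
    import re

    def tokenize(text: str) -> list[str]:
        """Lowercase and extract word-like tokens (a simplistic tokenizer)."""
        return re.findall(r"[a-z']+", text.lower())

    def type_token_ratio(tokens: list[str]) -> float:
        """TTR = number of distinct word types / number of tokens."""
        return len(set(tokens)) / len(tokens) if tokens else 0.0

    def root_ttr(tokens: list[str]) -> float:
        """Guiraud's root TTR, less sensitive to text length than plain TTR."""
        return len(set(tokens)) / math.sqrt(len(tokens)) if tokens else 0.0

    answer = "I would like to visit London because I have never visited it before."
    tokens = tokenize(answer)
    print(f"TTR: {type_token_ratio(tokens):.3f}, root TTR: {root_ttr(tokens):.3f}")

Root TTR is often preferred over plain TTR for learner responses of varying length, since TTR falls mechanically as texts get longer.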
“…The problem of SAQ scoring has received considerable attention, with several shared tasks and competitions organized in the past (e.g., a SemEval shared task (Dzikovska et al., 2013) or the ASAP Kaggle competition). The task is to predict the human labels for each instance and, traditionally, this has been done using n-grams (Heilman and Madnani, 2015) or a wide variety of linguistic features as in Tack et al. (2017). Leacock and Chodorow (2003) use predicate-argument structure, pronominal reference, morphological analysis and synonyms to rate the questions.…”
Section: Automated SAQ Scoring (mentioning)
confidence: 99%
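To make the n-gram approach mentioned in the excerpt above concrete, here is a minimal short-answer scoring sketch using scikit-learn. The training answers, CEFR-style labels, and hyperparameters are hypothetical placeholders, not the setup of any of the cited systems.

    # Minimal n-gram baseline for short-answer scoring, in the spirit of the
    # approaches cited above. Data, labels, and settings are illustrative.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    # Hypothetical training data: short answers paired with CEFR-style labels.
    answers = [
        "I like to travel because it is fun.",
        "Travelling broadens the mind and exposes one to new cultures.",
    ]
    labels = ["A2", "B2"]

    # Word uni- and bigram TF-IDF features feeding a linear classifier.
    model = make_pipeline(
        TfidfVectorizer(ngram_range=(1, 2), lowercase=True),
        LogisticRegression(max_iter=1000),
    )
    model.fit(answers, labels)
    print(model.predict(["I enjoy travelling very much."]))

In practice such a baseline would be trained on hundreds of graded answers per prompt and could be extended with the kinds of linguistic features (syntactic, discursive, readability) used in Tack et al. (2017).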