2018
DOI: 10.1080/08957347.2018.1464450
Validating human and automated scoring of essays against “True” scores

Cited by 13 publications (11 citation statements)
References 10 publications
“…In addition to the studies of agreement between the automated score and other indicators of writing proficiency, there were also studies of the correlation between the automated score and the estimated true score, which was the average of the scores given by a group of different raters to an essay. The common finding of such studies was that the correlation between automated and human scores was nearly the same as the inter-rater correlation (Attali, 2007; Attali & Burstein, 2006; Cohen, Levi, & Ben-Simon, 2018). Apart from such analyses of correlation, researchers also tried to create proper feature weights for the AES model, aiming to optimize the measurement properties and improve the reliability of AES scores (Attali, 2015; Bridgeman & Ramineni, 2017).…”
Section: Literature Review
confidence: 99%
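The estimated true score mentioned in these statements is simply the mean of several human raters' scores for the same essay. A minimal sketch with synthetic data (the latent-ability model, noise levels, and all numbers are illustrative assumptions, not values from the cited studies) shows how the automated-human correlation can be set beside the inter-rater correlation and the correlation with that estimated true score:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 essays, each scored by 4 human raters and one AES system.
n_essays, n_raters = 200, 4
true_ability = rng.normal(3.5, 1.0, n_essays)                              # latent writing proficiency
human = true_ability[:, None] + rng.normal(0, 0.7, (n_essays, n_raters))   # human scores with rater noise
automated = true_ability + rng.normal(0, 0.7, n_essays)                    # AES score with its own noise

# Estimated true score: the average of the scores given by the group of raters.
estimated_true = human.mean(axis=1)

# Inter-rater correlation: mean correlation over all pairs of human raters.
pairs = [(i, j) for i in range(n_raters) for j in range(i + 1, n_raters)]
inter_rater = np.mean([np.corrcoef(human[:, i], human[:, j])[0, 1] for i, j in pairs])

# Correlation of the automated score with a single rater vs. with the estimated true score.
auto_vs_human = np.corrcoef(automated, human[:, 0])[0, 1]
auto_vs_true = np.corrcoef(automated, estimated_true)[0, 1]

print(f"inter-rater r         = {inter_rater:.2f}")
print(f"automated vs. human r = {auto_vs_human:.2f}")
print(f"automated vs. true r  = {auto_vs_true:.2f}")
```

Because averaging over raters cancels part of the individual rater noise, the automated score tends to correlate more strongly with the estimated true score than with any single rater, which is the kind of comparison the cited studies report.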
“…This study follows the same foundations of validating the AES framework, meaning that machine- and human-produced scores were compared as a way of validating scoring accuracy. In other words, human ratings were considered the “gold standard” for evaluating AES scoring performance (Cohen, Levi, & Ben-Simon, 2018; Powers, Escoffery, & Duchnowski, 2015).…”
Section: Coh-Metrix Features and Validation of Machine Scoring
confidence: 99%
“…This study follows the same foundations of validating the AES framework, meaning that machine- and human-produced scores were compared as a way of validating scoring accuracy. In other words, human ratings were considered the "gold standard" for evaluating AES scoring performance (Cohen, Levi, & Ben-Simon, 2018; Powers, Escoffery, & Duchnowski, 2015). In operational settings, human raters are trained to score by using a rubric and anchor essays, which help them align their rating processes with the score boundaries and designated writing construct.…”
Section: Validation of the Essay Scoring Model
confidence: 99%
“…However, the use of automated scoring systems depends on the obtained scores being as similar as possible to those of human raters and on their reliability not being low. Human raters are an important criterion for automated scoring systems (Cohen, Levi, & Ben-Simon, 2018). Automated scoring results that have poor reliability or are inconsistent with human raters may lead to wrong decisions about individuals.…”
Section: Introduction
confidence: 99%
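As a companion to the reliability point above, agreement between automated and human scores is commonly summarized with exact agreement, adjacent agreement, and quadratic weighted kappa. A minimal sketch follows; the 1-6 rubric and the score vectors are hypothetical illustrations, not data from Cohen, Levi, & Ben-Simon (2018):

```python
import numpy as np

def quadratic_weighted_kappa(a, b, min_score=1, max_score=6):
    """Quadratic weighted kappa between two integer score vectors on the same scale."""
    a = np.asarray(a) - min_score
    b = np.asarray(b) - min_score
    k = max_score - min_score + 1
    # Observed joint distribution of (human, automated) score pairs.
    observed = np.zeros((k, k))
    for x, y in zip(a, b):
        observed[x, y] += 1
    observed /= observed.sum()
    # Expected joint distribution under independence of the two raters' marginals.
    expected = np.outer(np.bincount(a, minlength=k), np.bincount(b, minlength=k)).astype(float)
    expected /= expected.sum()
    # Quadratic disagreement weights.
    weights = np.array([[(i - j) ** 2 for j in range(k)] for i in range(k)]) / (k - 1) ** 2
    return 1.0 - (weights * observed).sum() / (weights * expected).sum()

# Hypothetical human and automated scores on a 1-6 rubric (illustrative only).
human = np.array([3, 4, 2, 5, 4, 3, 6, 1, 4, 5, 3, 2])
automated = np.array([3, 4, 3, 5, 3, 3, 5, 2, 4, 5, 4, 2])

print("exact agreement   :", np.mean(human == automated))
print("adjacent agreement:", np.mean(np.abs(human - automated) <= 1))
print("weighted kappa    :", round(quadratic_weighted_kappa(human, automated), 3))
```

In practice these agreement statistics for the automated system are compared against the same statistics computed between two independent human raters, which is the human-rater criterion the citing study refers to.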