2013
DOI: 10.1002/j.2333-8504.2013.tb02343.x
INVESTIGATING THE SUITABILITY OF IMPLEMENTING THE E‐RATER® SCORING ENGINE IN A LARGE‐SCALE ENGLISH LANGUAGE TESTING PROGRAM

Abstract: In this research, we investigated the suitability of implementing e‐rater® automated essay scoring in a high‐stakes large‐scale English language testing program. We examined the effectiveness of generic scoring and 2 variants of prompt‐based scoring approaches. Effectiveness was evaluated on a number of dimensions, including agreement between the automated and the human score and relations with criterion variables. Results showed that the sample size was generally not sufficient for prompt‐specific scoring. Fo…
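The abstract evaluates human–machine agreement as one dimension of effectiveness. In automated essay scoring research, such agreement is commonly summarized with quadratic weighted kappa over the two raters' integer scores. The following is a minimal illustrative sketch (the function name and the 6-point scale are assumptions for illustration, not taken from the report):

```python
from collections import Counter

def quadratic_weighted_kappa(human, machine, min_score=1, max_score=6):
    """Quadratic weighted kappa between two parallel integer score vectors.

    1.0 = perfect agreement; 0.0 = chance-level agreement.
    """
    n_cats = max_score - min_score + 1
    n = len(human)
    # Observed joint score counts.
    obs = [[0.0] * n_cats for _ in range(n_cats)]
    for h, m in zip(human, machine):
        obs[h - min_score][m - min_score] += 1
    # Marginal score histograms for the chance-agreement baseline.
    hist_h = Counter(h - min_score for h in human)
    hist_m = Counter(m - min_score for m in machine)
    num = den = 0.0
    for i in range(n_cats):
        for j in range(n_cats):
            # Quadratic disagreement weight, 0 on the diagonal.
            w = (i - j) ** 2 / (n_cats - 1) ** 2
            num += w * obs[i][j]
            den += w * hist_h[i] * hist_m[j] / n
    return 1.0 - num / den

human = [4, 3, 5, 4, 2, 4]
machine = [4, 4, 5, 3, 2, 4]
print(quadratic_weighted_kappa(human, machine))  # ≈ 0.8125
```

Note that quadratic weighting penalizes large human–machine discrepancies much more heavily than adjacent-score disagreements, which is why it is favored for ordinal holistic scales.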

Cited by 4 publications (2 citation statements); References 12 publications.
“…When the human–machine discrepancy exceeds a given threshold, a second human rating is solicited. Whereas the specific thresholds employed in operational settings have not been reported, prior research has evaluated thresholds from 0.5 to 1.5 on a 5‐ or 6‐point holistic scoring scale (e.g., Zhang, Breyer, & Lorenz, ).…”
Section: Unusual Responses in Automated Essay Scoring
confidence: 99%
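The citation statement above describes a standard adjudication rule: when the absolute human–machine score discrepancy exceeds a threshold, a second human rating is requested. A minimal sketch of that check, assuming a strict-inequality trigger and using 1.0 only as an illustrative threshold within the 0.5–1.5 range the prior research evaluated (the function name is hypothetical):

```python
def needs_adjudication(human_score, machine_score, threshold=1.0):
    """Flag a response for a second human rating when the absolute
    human-machine discrepancy exceeds the threshold.

    A discrepancy exactly equal to the threshold does not trigger
    adjudication under this strict-inequality convention.
    """
    return abs(human_score - machine_score) > threshold

# On a 6-point holistic scale with a threshold of 1.0:
print(needs_adjudication(4, 4))  # False: exact agreement
print(needs_adjudication(3, 5))  # True: discrepancy of 2 exceeds 1.0
```

Lowering the threshold toward 0.5 routes more responses to a second human rater (higher cost, tighter quality control); raising it toward 1.5 does the opposite.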
“…The effect of differences between human raters can substantially increase the bias in the final score without careful monitoring [8]. Manual correction makes human rating labor-intensive, time-consuming, and expensive [9]. Based on these problems, a computer assessment is needed to help facilitate the assessment.…”
Section: Introduction
confidence: 99%