2014
DOI: 10.1002/ets2.12022

A Study of the Use of the e‐rater® Scoring Engine for the Analytical Writing Measure of the GRE® revised General Test

Abstract: In this research, we investigated the feasibility of implementing the e‐rater® scoring engine as a check score in place of all‐human scoring for the Graduate Record Examinations® (GRE®) revised General Test (rGRE) Analytical Writing measure. This report provides the scientific basis for the use of e‐rater as a check score in operational practice. We proceeded with the investigation in four phases. In phase I, for both argument and issue prompts, we investigated the quality of human scoring consistency across i…

Cited by 5 publications (9 citation statements); references 16 publications.
“…In contrast, the primary adjudication threshold has been set at 1.5 for both the TOEFL independent task and the PRAXIS® test's argumentative task under a contributory scoring approach. Similarly, it was recently reset to 2.0 for the TOEFL integrated task, with the potentially undesirable effects of the larger primary threshold on score separation compensated for by giving human ratings twice the weight of the machine scores for reporting purposes (Breyer et al., 2014; Ramineni, Trapani, & Williamson, 2015; Ramineni, Trapani, Williamson, Davey, & Bridgeman, 2012b; Ramineni et al., 2012a).…”
Section: Determining the Primary Adjudication Threshold
confidence: 99%
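
The contributory scoring mechanics described in this statement (a primary adjudication threshold of 1.5 or 2.0 points, with human ratings given twice the weight of machine scores) can be illustrated with a short sketch. The function below is a hypothetical illustration under those stated rules, not ETS's operational implementation; the weighted-average formula and the adjudication fallback are assumptions extrapolated from the text.

from typing import Optional

def contributory_score(human: float, machine: float,
                       threshold: float = 2.0,
                       human_weight: float = 2.0) -> Optional[float]:
    """Illustrative contributory scoring: combine a human rating and
    a machine score, weighting the human rating human_weight times as
    heavily. Returns None when the human-machine discrepancy exceeds
    the adjudication threshold, signalling that the essay needs an
    additional rating."""
    if abs(human - machine) > threshold:
        return None  # adjudicate: route the essay to another rater
    # Weighted average; with human_weight = 2 the human rating counts
    # twice as much as the machine score, offsetting the larger
    # threshold's effect on score separation.
    return (human_weight * human + machine) / (human_weight + 1.0)

For example, contributory_score(4.0, 3.0) returns about 3.67, while a human/machine pair differing by more than 2.0 points returns None and would be adjudicated.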
“…The argument of critics is that “nonsense” and perhaps “obviously flawed” essays that result from gaming attempts can be detected by human readers but not always by built‐in machine detectors (i.e., advisories) in the automated scoring system (see Ramineni, Trapani, Williamson, Davey, & Bridgeman, or Breyer et al., for a description of the different advisories evaluated for the GRE‐AW section). For the purpose of this report, the sole machine scoring approach is not considered further for the GRE‐AW section because the consequential use of the GRE‐AW section scores is associated with relatively high stakes for individual test takers.…”
Section: Terminology and Motivation
confidence: 99%
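
As a companion to this critique, the gating role of advisories under a check-score model can be sketched as follows. The advisory labels and routing values below are invented for illustration; the actual e‐rater advisories are described in the works cited above.

def route_essay(machine_score, advisories):
    """Illustrative routing rule: if any built-in advisory fires
    (e.g., a suspected off-topic, excessively brief, or gamed
    response), the machine score is not trusted and the essay falls
    back to all-human scoring."""
    if advisories or machine_score is None:
        return "human_only"   # human readers catch what advisories may miss
    return "check_score"      # machine score may serve as a quality check

# A flagged essay is routed to human-only scoring:
assert route_essay(3.5, ["off_topic"]) == "human_only"
assert route_essay(3.5, []) == "check_score"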
“…The automated scoring models based on e‐rater have been successfully evaluated in recent years for the writing prompts included in the old GRE General Test (Ramineni, Trapani, Williamson, Davey, & Bridgeman), the TOEFL iBT® test (Attali, Bridgeman, & Trapani; Ramineni, Trapani, Williamson, Davey, & Bridgeman), and the GRE revised General Test (Breyer et al.). The current TOEFL iBT test uses e‐rater for operational scoring of the essay tasks, and the GRE uses e‐rater as a quality control on the reported human scores, thus allowing the programs to report scores efficiently and to use their human rater pool more effectively.…”
Section: Introduction
confidence: 99%
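
The check-score role described here, in which the human rating is reported and e‐rater serves only as a quality control, can be sketched in a few lines. This is a minimal sketch under assumed values: the 0.5-point trigger and the two-human averaging rule are illustrative placeholders, since the report derives the operational thresholds empirically.

from typing import Callable

def check_score(human1: float, machine: float,
                second_human: Callable[[], float],
                threshold: float = 0.5) -> float:
    """Illustrative check-score workflow: report the first human
    rating when the machine check agrees within threshold; otherwise
    obtain a second human rating and report the mean of the two human
    scores. The machine score never enters the reported score."""
    if abs(human1 - machine) <= threshold:
        return human1                  # machine confirms the human score
    human2 = second_human()            # discrepancy: second human rates
    return (human1 + human2) / 2.0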