2020
DOI: 10.1002/ets2.12293

Evaluations of Automated Scoring Systems in Practice

Abstract: This research report provides a description of the processes of evaluating the “deployability” of automated scoring (AS) systems from the perspective of large‐scale educational assessments in operational settings. It discusses a comprehensive psychometric evaluation that entails analyses that take into consideration the specific purpose of AS, the test design, the quality of human scores, the data collection design needed to train and evaluate the AS model, and the application of statistics and evaluation crit…

Cited by 7 publications
(6 citation statements)
References 24 publications
“…As we see an increasing number of studies that explore applying or extending ML methods to solving measurement problems, we hope to call the attention of the measurement community to identify, discuss, and work together to address the risks. This concern is shared by some of the articles we reviewed (e.g., Alexander et al., 2020; Anderson et al., 2020; Higgins & Heilman, 2014; Rauthmann, 2020; Rotou & Rupp, 2020; Stachl et al., 2020).…”
Section: Risks Associated With Involving ML In Measurement
citation type: mentioning
confidence: 95%
“…The other four articles include examples of new processes for validating ML-enhanced measurement. Lottridge et al. (2020) and Rotou and Rupp (2020) suggested practical processes for evaluating automated scoring systems. Speer (2021) investigated the validity and generalizability evidence for scores generated using natural language processing (NLP) algorithms.…”
Section: Risks Associated With Involving ML In Measurement
citation type: mentioning
confidence: 99%
“…As we see an increasing number of studies that explore applying or extending ML methods to solving measurement problems, we hope to call the attention of the measurement community to identify, discuss, and work together to address the risks. This concern is shared by some of the articles we reviewed (e.g., Alexander III et al., 2020; Anderson et al., 2020; Higgins & Heilman, 2014; Rauthmann, 2020; Rotou & Rupp, 2020; Stachl et al., 2020).…”
Section: Risks Associated With Involving ML In Measurement
citation type: mentioning
confidence: 95%
“…As substitutes for human raters, AES systems utilize computer technology to evaluate and score written prose. At the very minimum, AES systems provide consistency in scoring essays, and are not time-consuming for scoring routine, written-language assignments (Ifenthaler & Dikli, 2015; Rotou & Rupp, 2020).…”
Section: Introduction
citation type: mentioning
confidence: 99%