An increasing number of studies on the use of tools for automated writing evaluation (AWE) in writing classrooms suggest growing interest in their potential for formative assessment. As with all assessments, these applications should be validated in terms of their intended interpretations and uses (Kane, 2012). A recent argument-based validation framework outlined inferences that require backing to support integration of one AWE tool, Criterion, into a college-level English as a Second Language (ESL) writing course. The present research appraised evidence for the assumptions underlying two inferences in this argument. In the first of two studies, we assessed evidence for the evaluation inference, which includes the assumption that Criterion provides students with accurate feedback. The second study focused on the utilisation inference, which involves the assumption that Criterion feedback is useful for students to make decisions about revisions. Results showed that accuracy varied considerably across error types, as did students' ability to use Criterion feedback to correct written errors. The findings can inform discussion of whether and how to integrate AWE into writing classrooms, and they raise important questions regarding standards for validation of AWE as formative assessment, Criterion developers' approach to accuracy, and instructors' assumptions about the underlying purposes of AWE-based writing activities.
Keywords: academic writing, argument-based validation, automated writing evaluation, ESL, formative assessment