2012
DOI: 10.1111/j.1745-3992.2011.00223.x

A Framework for Evaluation and Use of Automated Scoring

Abstract: A framework for evaluation and use of automated scoring of constructed‐response tasks is provided that entails both evaluation of automated scoring as well as guidelines for implementation and maintenance in the context of constantly evolving technologies. Consideration of validity issues and challenges associated with automated scoring are discussed within the framework. The fit between the scoring capability and the assessment purpose, the agreement between human and automated scores, the consideration of as…
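The abstract's criterion of agreement between human and automated scores is commonly operationalized in the automated-scoring literature with statistics such as quadratic weighted kappa (QWK). The specific metric here is an illustrative assumption, not something stated in the abstract; a minimal pure-Python sketch:

```python
def quadratic_weighted_kappa(human, machine, min_r, max_r):
    """Agreement between two raters on an integer rating scale
    [min_r, max_r], penalizing disagreements by squared distance.
    1.0 = perfect agreement; 0.0 = chance-level agreement.
    Requires max_r > min_r (a scale wider than one point)."""
    k = max_r - min_r + 1
    n = len(human)
    # Observed confusion matrix of rating pairs.
    obs = [[0] * k for _ in range(k)]
    for h, m in zip(human, machine):
        obs[h - min_r][m - min_r] += 1
    # Marginal rating histograms for the chance-agreement baseline.
    hist_h = [sum(row) for row in obs]
    hist_m = [sum(obs[i][j] for i in range(k)) for j in range(k)]
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = (i - j) ** 2 / (k - 1) ** 2   # quadratic penalty
            expected = hist_h[i] * hist_m[j] / n
            num += w * obs[i][j]
            den += w * expected
    return 1.0 - num / den
```

Identical rating vectors yield exactly 1.0, while systematic disagreement drives the statistic toward (and below) zero, which is why QWK is a stricter check than raw percent agreement on skewed score distributions.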

Cited by 261 publications (279 citation statements); References 31 publications
“…Assessment experts characterize this as a "low-stakes" application of AWE (Chapelle & Chung, 2010; Williamson, Xi, & Breyer, 2012; Weigle, 2013a), but for students and instructors, time, effort, and funding are limited resources, and decisions about where to invest them are not inconsequential. More work is needed not only to address concerns about the pedagogical value of AWE feedback but to ensure quality experiences for end users, particularly those working in a second or foreign language.…”
Section: Results
confidence: 99%
“…Construct validity and its association with instructional activities have also fallen within the scope of AES system-centric research (Attali & Burstein, 2006; Page, Keith, & Lavoie, 1995). In the past decade, an important trend examining AES within the larger context of the argument-based approach to test validation (Kane, 1992, 2006) has focused on applying, refining, and expanding conceptual validation frameworks to particular applications of automated scoring (e.g., Bennett & Bejar, 1998; Xi, 2008; Williamson, Xi, & Breyer, 2012).…”
Section: AES for Assessment and AWE for Instruction
confidence: 99%
“…There is a large body of literature with regards to ATS systems of text produced by nonnative English-language learners (Page, 1968; Attali and Burstein, 2006; Rudner and Liang, 2002; Elliot, 2003; Landauer et al., 2003; Briscoe et al., 2010; Yannakoudakis et al., 2011; Sakaguchi et al., 2015, among others), overviews of which can be found in various studies (Williamson, 2009; Dikli, 2006; Shermis and Hammer, 2012). Implicitly or explicitly, previous work has primarily treated text scoring as a supervised text classification task, and has utilized a large selection of techniques, ranging from the use of syntactic parsers, via vectorial semantics combined with dimensionality reduction, to generative and discriminative machine learning.…”
Section: Introduction
confidence: 99%
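The citation statement above describes text scoring as a supervised learning task over linguistic features. A minimal pure-Python sketch under that framing; the features (essay length, average word length, type-token ratio) and the ridge-regression fit are illustrative assumptions, and production AES systems use far richer linguistic feature sets:

```python
import math

def featurize(text):
    """Toy feature vector for an essay: bias term, log word count,
    average word length, and type-token ratio (lexical diversity)."""
    words = text.split()
    n = len(words)
    avg_len = sum(len(w) for w in words) / n
    ttr = len({w.lower() for w in words}) / n
    return [1.0, math.log(n), avg_len, ttr]

def fit_ridge(X, y, lam=0.1):
    """Solve (X^T X + lam*I) w = X^T y by Gaussian elimination."""
    d, n = len(X[0]), len(X)
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) + (lam if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(X[r][i] * y[r] for r in range(n)) for i in range(d)]
    for col in range(d):                      # forward elimination
        piv = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, d):
            f = A[r][col] / A[col][col]
            for c in range(col, d):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    w = [0.0] * d                             # back substitution
    for i in reversed(range(d)):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, d))) / A[i][i]
    return w

def predict(w, text):
    """Predicted score: dot product of weights and essay features."""
    return sum(wi * xi for wi, xi in zip(w, featurize(text)))
```

Training pairs `(essay, human_score)` would be featurized, fit with `fit_ridge`, and new essays scored with `predict`; the supervised-classification variant the statement mentions simply swaps the regression for a classifier over the same kind of feature vectors.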