2018
DOI: 10.1177/0731948718803296

The Potential for Automated Text Evaluation to Improve the Technical Adequacy of Written Expression Curriculum-Based Measurement

Abstract: Written-expression curriculum-based measurement (WE-CBM) is used for screening and progress monitoring of students with, or at risk of, learning disabilities (LD) for academic supports; however, WE-CBM has limitations in technical adequacy, construct representation, and scoring feasibility as grade level increases. The purpose of this study was to examine the structural and external validity of automated text evaluation with Coh-Metrix versus traditional WE-CBM scoring for narrative writing samples (7-min duration)…
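
The traditional WE-CBM scoring that the abstract contrasts with automated text evaluation is typically based on simple production-dependent counts such as total words written (TWW) and words spelled correctly (WSC). The Python sketch below illustrates that counting idea only; the placeholder lexicon, metric names, and scoring rules are assumptions for demonstration, not the study's exact procedure.

# Minimal sketch of count-based WE-CBM scoring (illustrative only).
# The lexicon below is a stand-in for a real spelling reference.
ENGLISH_WORDS = {"the", "dog", "ran", "fast", "down", "hill"}

def score_sample(text: str) -> dict:
    """Return simple WE-CBM-style counts for one timed writing sample."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return {
        "TWW": len(words),                              # total words written
        "WSC": sum(w in ENGLISH_WORDS for w in words),  # words spelled correctly
    }

print(score_sample("The dog ran fasst down the hill."))
# expected output: {'TWW': 7, 'WSC': 6}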

Cited by 16 publications (16 citation statements) | References 45 publications

“…First, the findings of greater validity and diagnostic accuracy for all scoring approaches when based on three versus one screening sample per student build on prior generalizability theory studies demonstrating that multiple writing samples are needed for adequate reliability (Keller-Margulis et al., 2016; Kim et al., 2017). Second, the findings of comparable validity and diagnostic accuracy across complex … (Wilson et al., 2016) and (b) comparable relations to holistic quality for 7-min samples scored with free automated text evaluation and complex WE-CBM metrics (Mercer et al., 2019). These findings illustrate that it may be difficult to improve the validity of automated text evaluation beyond complex WE-CBM scoring; instead, investigating the optimal number of writing samples and writing duration may yield more benefits.…”
Section: Discussion
confidence: 54%
“…Consistent with recent calls for adopting open science practices in special education (Cook et al., 2018), developing and evaluating open-source tools can facilitate critical review and refinement of scoring models, replication efforts by other research teams, and greater usage of tools by removing cost barriers. For example, Mercer et al. (2019) examined the validity of scoring models based on Coh-Metrix (Graesser et al., 2014), a free but proprietary tool originally designed to predict the reading comprehension difficulty of texts, relative to WE-CBM scores for 7-minute narrative writing samples from students in second through fifth grade. Results indicated that composite scores based on Coh-Metrix could predict raters' holistic quality ratings on the screening samples, both for the samples used to generate the Coh-Metrix scores and for similar samples collected 3 months apart, and that correlations with holistic quality were similar for composites based on Coh-Metrix (r = .73-.81) and WE-CBM scores (r = .74-.77).…”
Section: Applying Automated Text Evaluation to WE-CBM
confidence: 99%
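
The statement above describes composite scores built from Coh-Metrix indices that predicted raters' holistic quality ratings, including on samples collected 3 months later. The snippet below is a hedged sketch of that general composite-score idea, assuming an ordinary least-squares regression over synthetic index values; the actual Coh-Metrix indices, modeling approach, and data are not reproduced here.

# Hedged sketch of the composite-score idea: regress holistic quality ratings
# on automated text-evaluation indices, then correlate the fitted composite
# with quality on later samples. All data here are synthetic placeholders.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
weights = np.array([0.5, 0.3, 0.2, 0.1, 0.05])

X_train = rng.normal(size=(100, 5))                 # e.g., five text indices per sample
y_train = X_train @ weights + rng.normal(scale=0.5, size=100)
model = LinearRegression().fit(X_train, y_train)

X_later = rng.normal(size=(40, 5))                  # samples collected at a later time point
y_later = X_later @ weights + rng.normal(scale=0.5, size=40)
composite = model.predict(X_later)

r = np.corrcoef(composite, y_later)[0, 1]
print(f"composite vs. holistic quality: r = {r:.2f}")
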
“…As the theory developed further, computational tools such as Coh-Metrix were presented to produce indices of the linguistic and discourse representations of pieces of text. Mercer et al. [5] studied the validity of automated text evaluation with this tool, using a number of algorithms to train different models, and found it generally valid for written-expression curriculum-based measurement.…”
Section: Related Work
confidence: 99%
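
The phrase "using a number of algorithms to train different models" could be realized, for example, by cross-validating several regressors over the same text-derived indices. The sketch below is an assumption-laden illustration of that workflow, with invented features and algorithm choices, not the procedure used by Mercer et al. [5].

# Hedged sketch: compare several learning algorithms on the same text indices
# via cross-validation. Features, algorithms, and metric are illustrative.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import Ridge
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(120, 8))                       # e.g., Coh-Metrix-style indices
y = 0.6 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.4, size=120)

for name, estimator in [("ridge", Ridge()),
                        ("random_forest", RandomForestRegressor(random_state=1))]:
    scores = cross_val_score(estimator, X, y, cv=5, scoring="r2")
    print(f"{name}: mean cross-validated R^2 = {scores.mean():.2f}")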