“…Advances in natural language processing (NLP) have produced improved automated assessment scoring approaches to support teaching and learning (e.g., Adair et al 2023; Wilson et al 2021). Proposed methodologies include data augmentation, next sentence prediction (Wu et al 2023), prototypical neural networks (Zeng et al 2023), cross-prompt fine-tuning (Funayama et al 2023), human-in-the-loop scoring via sampling responses (Singla et al 2022), and reinforcement learning (Liu et al 2022). While these methods have enjoyed varying degrees of success, most of these applications have targeted more structured mathematics and computer science tasks (i.e., tasks that can be solved formulaically), whose grading differs from scoring free-form short-answer responses by middle school students in science domains.…”