For some students, standardized tests serve as a conduit to disclose sensitive issues of harm or distress that may otherwise go unreported. By detecting this writing, known as crisis papers, testing programs have a unique opportunity to help mitigate the risk of harm to these students. In the context of online tests and automated scoring, machine learning is needed to detect such writing automatically. To build an accurate detection system, humans must first consistently label the data used to train the model. This paper argues that existing guidelines are insufficient for this task and proposes a three-level rubric to guide the collection of training data. In showcasing the fundamental machine learning procedures for creating an automatic text classification system, the following evidence emerges in support of the operational use of this rubric. First, hand-scorers largely agree with one another when assigning labels to text according to the rubric. Second, when these labeled data are used to train a baseline classifier, the model exhibits promising performance. Recommendations are made for improving the hand-scoring training process, with the ultimate goal of quickly and accurately assisting students in crisis.
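The abstract's claim that hand-scorers "largely agree" is typically quantified with a chance-corrected agreement statistic. As a minimal sketch (the abstract does not specify the statistic or the data, so the choice of Cohen's kappa and the label sequences below are illustrative assumptions), agreement between two raters applying a three-level rubric could be computed as:

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b, labels=(0, 1, 2)):
    """Chance-corrected agreement between two raters (illustrative sketch).

    kappa = (p_observed - p_expected) / (1 - p_expected), where p_expected
    is the agreement expected by chance from each rater's label frequencies.
    """
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_expected = sum((counts_a[l] / n) * (counts_b[l] / n) for l in labels)
    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical rubric labels: 0 = no concern, 1 = possible concern, 2 = crisis.
a = [0, 0, 1, 2, 0, 1, 2, 0, 0, 1]
b = [0, 0, 1, 2, 0, 0, 2, 0, 1, 1]
print(round(cohens_kappa(a, b), 3))  # → 0.677
```

In practice a weighted kappa is often preferred for ordinal rubrics such as this one, since disagreeing by one level (1 vs. 2) is less severe than disagreeing by two (0 vs. 2).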
The process of setting and evaluating student learning objectives (SLOs) has become increasingly popular as an example of classroom assessment intended to fulfill the dual purposes of informing instruction and holding teachers accountable. A concern is that the high-stakes purpose may lead to distortions in the inferences about students and teachers that SLOs can support. The present study explores this concern by contrasting student SLO scores in a large urban school district with performance on a common, objective external criterion, which is used to evaluate the extent to which student growth scores appear to be inflated. Using 2 years of data, growth comparisons are also made at the teacher level for teachers who submit SLOs and have students who take the state-administered large-scale assessment. Although the two measures of growth show similar relationships with demographic covariates and the same degree of stability across years, they are only weakly correlated.
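The "weakly correlated" finding would typically rest on a standard correlation coefficient between the two growth measures. As a minimal sketch (the abstract does not report the statistic used or any data, so Pearson's r and the teacher-level scores below are illustrative assumptions):

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length score vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical teacher-level growth scores: SLO-based vs. state-test-based.
slo_growth = [1.2, 0.8, 1.5, 0.9, 1.1, 1.4, 0.7, 1.3]
state_growth = [0.3, 0.9, 0.5, 0.4, 1.0, 0.6, 0.8, 0.2]
print(round(pearson_r(slo_growth, state_growth), 2))
```

A value near zero for teacher-level pairs like these would signal the divergence the study reports: the two instruments rank teachers' student growth quite differently even when their relationships to demographic covariates look similar.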