a Faculty of Behavioural, Management and Social Sciences (BMS), University of Twente, Enschede, the Netherlands; b CitoLab, Cito Institute for Educational Measurement, Arnhem, the Netherlands; c Psychometric Research Centre, Cito Institute for Educational Measurement, Arnhem, the Netherlands

ABSTRACT This study investigated (1) the extent to which presentations of measurement error in score reports influence teachers' decisions and (2) teachers' preferences regarding these presentations. Three presentation formats of measurement error (blur, colour value and error bar) were compared to a presentation format that omitted measurement error. The results from a factorial survey analysis showed that the position of a score relative to a cut-off score had the strongest influence on decisions. Moreover, the teachers (N = 337) indicated the need for additional information significantly more often when the score reports included an error bar than when they omitted measurement error. The error bar was also the most preferred presentation format. The results were supported by think-aloud protocols and focus groups, although several interpretation problems and misconceptions of measurement error were identified.
In educational practice, test results are used for several purposes. However, validity research has focused mainly on the validity of summative assessment. This article aims to provide a general framework for validating formative assessment. The authors applied the argument-based approach to validation to the context of formative assessment. This resulted in a proposed interpretation and use argument consisting of a score interpretation and a score use. The former involves inferences linking specific task performance to an interpretation of a student's general performance. The latter involves inferences regarding decisions about actions and educational consequences. The validity argument should focus on critical claims regarding score interpretation and score use, since both are essential to the effectiveness of formative assessment. The proposed framework is illustrated by an operational example, including a presentation of evidence that can be collected on the basis of the framework.
Increasing technological possibilities encourage test developers to modernize and improve computer-based assessments. From a validity perspective, however, these innovations can both strengthen and weaken the validity of test scores. In this theoretical chapter, the impact of technological advancements is discussed in the context of the argument-based approach to validity. It is concluded that the scoring and generalization inferences are of major concern when using these innovative techniques. Moreover, the use of innovative assessment tasks, such as simulations, multimedia-enhanced tasks or hybrid assessment tasks, is double-edged from a validity point of view: it strengthens the extrapolation inference but weakens the scoring, generalization and decision inferences.