Evidence-centered assessment design (ECD) is an approach to constructing educational assessments in terms of evidentiary arguments. This paper provides an introduction to the basic ideas of ECD, as well as some of the terminology and models that have been developed to implement the approach. In particular, it presents the high-level models of the Conceptual Assessment Framework (CAF) and the Four-process Delivery Architecture for assessment delivery systems. Special attention is given to the role of probability-based reasoning in accumulating evidence across task performances, in terms of belief about unobservable variables that characterize the knowledge, skills, and/or abilities of students. This is the role traditionally associated with psychometric models, such as those of item response theory (IRT) and latent class models. To unify the ideas and to provide a foundation for extending probability-based reasoning in assessment applications more broadly, however, a more general expression in terms of graphical models is indicated. This brief overview of evidence-centered design provides the reader with a feel for where and how graphical models fit into the larger enterprise of educational and psychological assessment. A simple example based on familiar large-scale standardized tests such as the Graduate Record Examinations® (GRE®) is used to fix ideas.

Key words: Assessment design, delivery system, evidence, psychometrics
Overview

What all educational assessments have in common is the desire to reason from particular things students say, do, or make, to broader inferences about their knowledge and abilities. Over the past century, a number of assessment methods have evolved for addressing this problem in a principled and systematic manner. The measurement models of classical test theory and, more recently, item response theory (IRT) and latent class analysis, have proved quite satisfactory for the large-scale tests and classroom quizzes with which every reader is by now quite familiar.

But off-the-shelf assessments and standardized tests are increasingly unsatisfactory for guiding learning and evaluating students' progress. Advances in cognitive and instructional sciences stretch our expectations about the kinds of knowledge and skills we want to develop in students and the kinds of observations we need to evidence them (Glaser, Lesgold, & Lajoie, 1987). Advances in technology make it possible to evoke evidence of knowledge more broadly conceived and to capture more complex performances. One of the most serious bottlenecks we face, however, is making sense of the complex data that result.

Fortunately, advances in evidentiary reasoning (Schum, 1994) and in statistical modeling (Gelman, Carlin, Stern, & Rubin, 1995) allow us to bring probability-based reasoning to bear on the problems of modeling and uncertainty that arise naturally in all assessments. These advances extend the principles upon which familiar test theory is grounded to more varied and complex inferences from more complex data. One canno...