The idea of optimizing experimental design to give estimators maximal efficiency has been around in the statistical literature for several decades, but its applicability to sampling problems in item response theory (IRT) has not been widely noticed. The purpose of this paper is to show how optimum design principles can be used to improve item and examinee sampling in IRT-based test assembly and item calibration. For both applications a result based on the maximin principle is given. The maximin principle fits these applications naturally, because IRT models are nonlinear and involve criteria of optimality that depend on the unknown parameters.
Optimum Design in IRT: Applications to Test Assembly and Item Calibration

The main topics addressed in statistics are parameter estimation and hypothesis testing. For both topics the literature has produced fruitful methods of finding estimators and test statistics, as well as important criteria for evaluating their performance. Most statistical theory is based on the assumption of simple random sampling, and it seems safe to assert that the majority of statisticians have hardly any interest in sampling beyond this assumption. Two exceptions to this practice are known, though. One is the interest in sampling procedures more complicated than simple random sampling, notably in the domain of survey research (Kalton, 1983; Särndal, Swensson, and Wretman, 1992). The other began with the pioneering work on optimum design of statistical experiments by R. A. Fisher, who, ironically, can also be considered the founder of mainstream statistics. Fisher's work originated in the domain of linear models. An important feature of linear models is that they describe the outcomes of statistical experiments in which the statistician may have control over some of the variables, and hence is able to design an experiment in which sampling is optimal with respect to estimation of the parameters. An example is the experiment underlying the estimation of the parameters in a bivariate linear regression model in which the predictor is a fixed variable and the statistician can select the levels of this predictor. Likewise, if the predictor is a random variable, the statistician may have control of the probabilities with which the levels of the

1 University of Twente,