In this manuscript, the applicability of the Hausman test to the evaluation of item response models is investigated. The Hausman test is a general test of model fit. The test assesses whether, for the model in question, the parameter estimates of two different estimators coincide. The test can be implemented for item response models by comparing the parameter estimates of the marginal maximum likelihood estimator with the corresponding parameter estimates of a limited information estimator. For a correctly specified item response model, the difference between the two estimates is normally distributed around zero. The Hausman test can be used for the evaluation of item fit and global model fit. The performance of the test is evaluated in a simulation study. The simulation study suggests that the implemented versions of the test adhere well to the nominal Type-I error rate in samples of 1,000 or more test takers. The test is also capable of detecting misspecified item characteristic functions, but lacks power to detect violations of the conditional independence assumption.

Keywords: item response theory, 2-PL model, model fit, item fit, Hausman test
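As a rough illustration of the comparison described above, the classical textbook form of the Hausman statistic can be sketched as follows. The sketch assumes that one estimator is efficient under a correctly specified model, so that the covariance matrix of the difference equals the difference of the two covariance matrices; whether the versions of the test implemented in this manuscript use exactly this covariance estimate is not stated here. The function name `hausman_statistic` and all numerical values are hypothetical and serve only as an example.

```python
import numpy as np


def hausman_statistic(est_efficient, est_consistent, cov_efficient, cov_consistent):
    """Return the classical Hausman statistic and its degrees of freedom.

    Assumption: est_efficient is efficient under the null hypothesis, so the
    covariance of the difference is cov_consistent - cov_efficient.
    """
    d = np.asarray(est_consistent, dtype=float) - np.asarray(est_efficient, dtype=float)
    v_diff = np.asarray(cov_consistent, dtype=float) - np.asarray(cov_efficient, dtype=float)
    # The pseudo-inverse guards against a (near-)singular covariance difference.
    stat = float(d @ np.linalg.pinv(v_diff) @ d)
    # Degrees of freedom equal the rank of the covariance difference.
    df = int(np.linalg.matrix_rank(v_diff))
    return stat, df


# Hypothetical estimates (discrimination, difficulty) for a single item.
mml = [1.20, -0.50]       # marginal maximum likelihood estimates
limited = [1.26, -0.46]   # limited information estimates
cov_mml = np.array([[0.010, 0.002], [0.002, 0.008]])
cov_limited = np.array([[0.014, 0.003], [0.003, 0.011]])

stat, df = hausman_statistic(mml, limited, cov_mml, cov_limited)
print(f"H = {stat:.3f}, df = {df}")
```

Under these assumptions, the statistic is referred to a chi-square distribution with `df` degrees of freedom. For the hypothetical values above, the statistic is about 1.13 with 2 degrees of freedom, well below the 5% critical value of 5.99, so the two sets of estimates would be judged to coincide.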
ANALYZING THE FIT OF IRT MODELS WITH THE HAUSMAN TEST

Item response models are measurement models that allow inferring individual traits from responses given to the items of a standardized test. At the core of item response models are precise assumptions about the relation between the traits and the response to a single item and about the interrelation of the responses to different items. These assumptions then serve as a mathematical basis for deducing statements about a test taker's traits from his/her responses. The correctness of such inferential statements depends crucially on the validity of the item response model. As the results of psychological assessment often have important consequences, one has to guarantee that the conclusions drawn about the test takers have a sound basis. Therefore, it is indispensable to check the adequacy of the chosen item response model and its assumptions carefully. Such a check requires a powerful test of model fit.

Several tests of model fit have been proposed in the past. A short overview of the different tests is given in the following section. In doing so, the focus is mainly on the two-parameter logistic model. Nothing will be said about tests that were proposed exclusively for the Rasch model and cannot be used in general; for such tests see Glas and Verhelst (1995), Suárez Falcón and Glas (2003), and Maydeu-Olivares and Montaño (2013). The review also does not cover the general approaches used in non-parametric item response theory (Sijtsma, 1998) or tests within the Bayesian framework (Sinharay, 2016). Tests of differential item functioning will also not be addressed (Magis et al., 2010). Having given this overview, an alternative test of model fit is
Frontiers in Psychology | www.frontiersin.org | February 2020 | Volume 11 | Article 149 | Ranger and Much | Hausman Test for IRT Models