A logistic regression model for characterizing differential item functioning (DIF) between two groups is presented. A distinction is drawn between uniform and nonuniform DIF in terms of the parameters of the model. A statistic for testing the hypothesis of no DIF is developed. Through simulation studies, it is shown that the logistic regression procedure is more powerful than the Mantel-Haenszel procedure for detecting nonuniform DIF and as powerful in detecting uniform DIF.
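The distinction between uniform and nonuniform DIF can be written out explicitly. The block below is a sketch of the usual logistic regression formulation, not a transcription of the paper's equations: the symbols ($u$ for the item response, $\theta$ for the matching ability or total score, $g$ for a 0/1 group indicator, and $\beta_0,\dots,\beta_3$ for the coefficients) are assumptions chosen for illustration, and the likelihood ratio form of the test is one common choice that may differ from the statistic developed in the paper.

```latex
% Logistic regression model for one item (notation assumed for illustration)
\[
  P(u = 1 \mid \theta, g) = \frac{e^{z}}{1 + e^{z}},
  \qquad
  z = \beta_0 + \beta_1 \theta + \beta_2 g + \beta_3 (\theta g).
\]
% Uniform DIF: the group effect is constant across ability levels
\[
  \text{uniform DIF:} \quad \beta_2 \neq 0, \; \beta_3 = 0.
\]
% Nonuniform DIF: the group effect changes with ability (interaction term)
\[
  \text{nonuniform DIF:} \quad \beta_3 \neq 0.
\]
% Hypothesis of no DIF, tested here by comparing nested models
\[
  H_0 : \beta_2 = \beta_3 = 0,
  \qquad
  G^2 = -2\left[\ln L_{\text{reduced}} - \ln L_{\text{full}}\right]
  \;\sim\; \chi^2_{2} \text{ under } H_0 .
\]
```

Here the reduced model keeps only $\beta_0$ and $\beta_1$, so the two degrees of freedom correspond to the group term and the interaction term.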
The Mantel-Haenszel (MH) procedure is sensitive to only one type of differential item functioning (DIF). It is not designed to detect DIF that has a nonuniform effect across trait levels. By generalizing the model underlying the MH procedure, a more general DIF detection procedure has been developed (Swaminathan & Rogers, 1990). This study compared the performance of this procedure—the logistic regression (LR) procedure—to that of the MH procedure in the detection of uniform and nonuniform DIF in a simulation study which examined the distributional properties of the LR and MH test statistics and the relative power of the two procedures. For both the LR and MH test statistics, the expected distributions were obtained under nearly all conditions. The LR test statistic did not have the expected distribution for very difficult and highly discriminating items. The LR procedure was found to be more powerful than the MH procedure for detecting nonuniform DIF and as powerful in detecting uniform DIF. Index terms: differential item functioning, logistic regression, Mantel-Haenszel statistic, nonuniform DIF, uniform DIF. The Mantel-Haenszel (MH) procedure is currently one of the most popular procedures for detecting differential item functioning (DIF). The primary reasons for its popularity include its computational simplicity, ease of implementation, and associated test of statistical significance. These advantages, however, are obtained at the cost of some generality. The MH procedure is designed to detect uniform DIF and may not be sensitive to nonuniform DIF. Uniform DIF exists if there is no interaction be
This study compared three procedures, the Mantel-Haenszel (MH), the simultaneous item bias (SIB), and the logistic regression (LR) procedures, with respect to their Type I error rates and power to detect nonuniform differential item functioning (DIF). Data were simulated to reflect a variety of conditions; the factors manipulated included sample size, ability distribution differences between the focal and the reference groups, proportion of DIF items in the test, DIF effect sizes, and type of item. A total of 384 conditions were studied. The SIB and LR procedures were equally powerful in detecting nonuniform DIF under most conditions. The MH procedure was not very effective in identifying nonuniform DIF items that had disordinal interactions. The Type I error rates were within the expected limits for the MH procedure and were higher than expected for the SIB and LR procedures; the SIB results showed an overall increase of approximately 1% over the LR results. Index terms: differential item functioning, logistic regression statistic, Mantel-Haenszel statistic, nondirectional DIF, simultaneous item bias statistic, SIBTEST, Type I error rate, unidirectional DIF.
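Why the MH procedure struggles with disordinal (crossing) DIF can be illustrated with a small numerical sketch. The code below implements the textbook Mantel-Haenszel chi-square and common odds ratio for 2x2 tables stratified by score level; it is not the simulation design used in the study. The function name `mantel_haenszel`, the table layout, and the toy counts are all hypothetical, and the continuity-corrected difference is floored at zero, which is one common convention.

```python
def mantel_haenszel(tables):
    """Mantel-Haenszel chi-square and common odds ratio.

    tables: list of (a, b, c, d) counts, one tuple per score stratum:
        a = reference correct, b = reference incorrect,
        c = focal correct,     d = focal incorrect.
    """
    sum_a = sum_expected = sum_var = 0.0
    num = den = 0.0
    for a, b, c, d in tables:
        t = a + b + c + d
        if t < 2:
            continue  # stratum too small to contribute a variance
        n_ref, n_foc = a + b, c + d
        m1, m0 = a + c, b + d
        sum_a += a
        sum_expected += n_ref * m1 / t
        sum_var += n_ref * n_foc * m1 * m0 / (t * t * (t - 1))
        num += a * d / t
        den += b * c / t
    # Continuity-corrected statistic; the corrected difference is floored at 0.
    chi2 = max(abs(sum_a - sum_expected) - 0.5, 0.0) ** 2 / sum_var
    alpha = num / den  # common odds ratio estimate
    return chi2, alpha


# Toy data (hypothetical): two score strata, 50 examinees per group in each.
# Uniform DIF: the reference group is favored in both strata.
uniform_tables = [(30, 20, 20, 30), (40, 10, 30, 20)]
# Crossing DIF: the reference group is favored in the low stratum,
# the focal group in the high stratum, by the same margin.
crossing_tables = [(30, 20, 20, 30), (20, 30, 30, 20)]

uniform_chi2, uniform_alpha = mantel_haenszel(uniform_tables)
crossing_chi2, crossing_alpha = mantel_haenszel(crossing_tables)

print("uniform :", round(uniform_chi2, 2), round(uniform_alpha, 2))
print("crossing:", round(crossing_chi2, 2), round(crossing_alpha, 2))
```

In the crossing case the stratum-level differences have opposite signs and cancel in the sum, so the statistic collapses toward zero and the common odds ratio toward one even though the item behaves differently for the two groups in every stratum. A model with an explicit ability-by-group interaction term does not rely on that sum.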