The MIMIC Method With Scale Purification for Detecting Differential Item Functioning

Wang, Wen Chung; Shih, Ching-Lin; Yang, Chih Chien

doi:10.1177/0013164409332228

Cited by 55 publications

(82 citation statements)

References 28 publications

“…As expected, the major findings are consistent with those found in dichotomous items (Wang et al, 2009). The strategy is supported, as demonstrated in the three simulation studies.…”

Section: Conclusion and Discussion

supporting

confidence: 90%

“…To accomplish this, many scale purification procedures have been developed and widely incorporated in DIF assessment methods (Clauser, Mazor, & Hambleton, 1993; French & Maller, 2007; Wang & Su, 2004a, 2004b). Following the principle of scale purification, Wang et al (2009) implemented a scale purification procedure on M-ST, which was called the MIMIC method with scale purification (denoted as M-SP; detailed steps are shown below), and in a series of simulations found that both M-ST and M-SP maintain a well-controlled Type I error rate when tests do not contain DIF items; M-SP outperforms M-ST in controlling the Type I error rate and yielding a higher power of DIF detection when tests contain DIF items, but unfortunately, M-SP begins to yield an inflated Type I error rate and a deflated power when there are 20% or more DIF items in the test. That is, even M-SP cannot guarantee an expected Type I error rate and a high power.…”

mentioning

confidence: 99%
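
The general idea of scale purification quoted above can be illustrated with a minimal sketch. It is not the M-SP procedure itself, whose detailed steps are not reproduced on this page. It assumes a generic loop in which flagged items are dropped from the matching set and all items are re-tested until the flagged set stops changing. The names `purify_scale`, `flag_dif`, `toy_flag`, and `TOY_EFFECTS` are hypothetical, and the toy test stands in for a real DIF test such as a MIMIC model fit.

```python
from statistics import mean
from typing import Callable, List, Set


def purify_scale(
    n_items: int,
    flag_dif: Callable[[int, List[int]], bool],
    max_iter: int = 20,
) -> Set[int]:
    """Generic iterative purification loop (an assumption, not the published steps).

    flag_dif(item, anchors) is a user-supplied test that returns True when
    `item` appears to show DIF given the matching items in `anchors`.
    """
    flagged: Set[int] = set()
    for _ in range(max_iter):
        # Matching set: every item not currently flagged as DIF.
        anchors = [i for i in range(n_items) if i not in flagged]
        # Re-test every item, never using the studied item as its own anchor.
        new_flagged = {
            i for i in range(n_items)
            if flag_dif(i, [a for a in anchors if a != i])
        }
        if new_flagged == flagged:
            break  # same items flagged in two successive passes
        flagged = new_flagged
    return flagged


# Toy stand-in for a DIF test: made-up group differences per item.
TOY_EFFECTS = [0.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]


def toy_flag(item: int, anchors: List[int]) -> bool:
    """Flag an item whose effect differs from the anchor mean by more than 0.5."""
    if not anchors:
        return False
    anchor_mean = mean([TOY_EFFECTS[a] for a in anchors])
    return abs(TOY_EFFECTS[item] - anchor_mean) > 0.5


flagged_items = purify_scale(len(TOY_EFFECTS), toy_flag)
```

In a real analysis the `flag_dif` callable would wrap a model-based significance test; the loop only shows how the matching set is refined between passes.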

MIMIC Methods for Assessing Differential Item Functioning in Polytomous Items

Wang

Shih

2010

Applied Psychological Measurement

Self Cite

Three multiple indicators-multiple causes (MIMIC) methods, namely, the standard MIMIC method (M-ST), the MIMIC method with scale purification (M-SP), and the MIMIC method with a pure anchor (M-PA), were developed to assess differential item functioning (DIF) in polytomous items. In a series of simulations, it appeared that all three methods yielded a well-controlled Type I error rate when tests did not contain any DIF items. M-ST and M-SP began to yield an inflated Type I error rate and a deflated power when tests contained 10% and 20% DIF items, respectively. M-PA maintained an expected Type I error rate and a high power even when tests contained as many as 40% DIF items. An iterative MIMIC procedure was proposed to select a small set of DIF-free items to serve as the anchor in M-PA. It was found in a series of simulations that this procedure yielded a very high rate of accuracy. Two simulated data sets were then analyzed to show applications of these MIMIC methods for DIF assessment in polytomous items.

“…However, a carefully designed purification procedure needs to be the first step for identifying potential DIF items when conducting DIF analyses with real data. In the literature, different anchor purification methods have been suggested to select DIF-free items for different DIF detection approaches (e.g., French and Maller, 2007; Wang et al, 2009; Woods, 2009b; Gonzalez-Betanzos and Abad, 2012). Depending on the selection of DIF-free items (i.e., purification), the DIF detection methods may provide different results regarding the number and type of detected DIF items.…”

Section: Limitations and Future Research

mentioning

confidence: 99%

Detecting Multidimensional Differential Item Functioning with the Multiple Indicators Multiple Causes Model, the Item Response Theory Likelihood Ratio Test, and Logistic Regression

Bulut

Suh

2017

Front. Educ.

Differential item functioning (DIF) is typically evaluated in educational and psychological assessments with a simple structure in which items are associated with a single latent trait. This study aims to extend the investigation of DIF for multidimensional assessments with a non-simple structure in which items can be associated with two or more latent traits. A simulation study was conducted with the multidimensional extensions of the item response theory likelihood ratio (IRT-LR) test, the multiple indicators multiple causes (MIMIC) model, and logistic regression for detecting uniform and non-uniform DIF in multidimensional assessments. The results indicated that the IRT-LR test outperformed the MIMIC and logistic regression approaches in detecting non-uniform DIF. When detecting uniform DIF, the MIMIC and logistic regression approaches appeared to perform better than the IRT-LR test in short tests, while the performances of all three approaches were very similar in longer tests. Type I error rates for logistic regression were severely inflated compared with the other two approaches. The IRT-LR test appears to be a more balanced and powerful method than the MIMIC and logistic regression approaches in detecting DIF in multidimensional assessments with a non-simple structure.

“…In practice, it is possible that multiple items can be DIF-present within a short test. Although item purification procedures can be conducted in advance (Wang, Shih, & Yang, 2009) and the purified covariate (i.e., total score computed after excluding DIF-present items) can be used in HLR-LC to match θ between groups, with tests being short, the purified covariate might be difficult to cover a wide range of θ. One possible solution is to use multiple indicators multiple causes (MIMIC) model, which is robust against DIF contamination (Finch, 2005).…”

Section: Discussion

mentioning

confidence: 99%
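
The "purified covariate" described in the statement above (a total score computed after excluding DIF-present items) can be sketched as follows. This is an illustrative assumption about the data layout, with responses stored as one list of item scores per examinee; the function name `purified_total_scores` is hypothetical and does not come from the cited papers.

```python
from typing import Iterable, List, Sequence


def purified_total_scores(
    responses: Sequence[Sequence[int]],
    dif_items: Iterable[int],
) -> List[int]:
    """Sum each examinee's item scores, skipping items flagged as DIF.

    responses: one row per examinee, one column per item (assumed layout).
    dif_items: zero-based column indices of items to exclude.
    """
    excluded = set(dif_items)
    return [
        sum(score for j, score in enumerate(row) if j not in excluded)
        for row in responses
    ]


# Two examinees, four dichotomous items; item 2 is treated as DIF-present.
example_scores = purified_total_scores([[1, 0, 1, 1], [0, 0, 1, 0]], dif_items=[2])
```

With a short test, removing even one or two items leaves few distinct total scores, which is the coverage concern the quoted statement raises.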

DIF Analysis with Multilevel Data: A Simulation Study Using the Latent Variable Approach

Jin

Eason

2016

JEI

The effects of mean ability difference (MAD) and short tests on the performance of various DIF methods have been studied extensively in previous simulation studies. Their effects, however, have not been studied under multilevel data structure. MAD was frequently observed in large-scale cross-country comparison studies where the primary sampling units were more likely to be clusters (e.g., schools). With short tests, regular DIF methods under MAD-present conditions might suffer from inflated type I error rate due to low reliability of test scores, which would adversely impact the matching ability of the covariate (i.e., the total score) in DIF analysis. The current study compared the performance of three DIF methods: logistic regression (LR), hierarchical logistic regression (HLR) taking multilevel structure into account, and hierarchical logistic regression with latent covariate (HLR-LC) taking multilevel structure into account as well as accounting for low reliability and MAD. The results indicated that HLR-LC outperformed both LR and HLR under most simulated conditions, especially under the MAD-present conditions when tests were short. Practical implications of the implementation of HLR-LC were also discussed.
