2007
DOI: 10.3200/jexe.75.4.293-316

Empirical Bayes Versus Standard Mantel-Haenszel Statistics for Detecting Differential Item Functioning Under Small Sample Conditions

Abstract: In this study, the authors assess several strategies created on the basis of the Mantel-Haenszel (MH) procedure for conducting differential item functioning (DIF) analysis with small samples. One of the analytical strategies is a loss function (LF) that uses empirical Bayes Mantel-Haenszel estimators, whereas the other strategies use the classical MH statistics (the MH chi-square statistic using high levels of significance [0.20] or empirical criteria on the basis of the magnitude of the MH-delta estimator). Th…
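For readers unfamiliar with the classical MH statistics named in the abstract, the following is a minimal sketch of how they are usually computed from 2x2 tables stratified by matching score. It is not the authors' code and does not cover the empirical Bayes loss function. The function name `mantel_haenszel_dif`, the `MHResult` type, and the example counts are hypothetical; the -2.35 scaling is the conventional delta-metric transformation, and the 0.20 significance level is the one quoted in the abstract.

```python
import math
from typing import Iterable, NamedTuple, Tuple


class MHResult(NamedTuple):
    alpha: float    # MH common odds ratio
    delta: float    # MH-delta = -2.35 * ln(alpha)
    chi2: float     # MH chi-square with continuity correction (1 df)
    p_value: float  # upper-tail p-value for chi2


def mantel_haenszel_dif(strata: Iterable[Tuple[int, int, int, int]]) -> MHResult:
    """Each stratum is (a, b, c, d) for one matching-score level:
    a = reference correct, b = reference incorrect,
    c = focal correct,     d = focal incorrect.
    (Hypothetical helper for illustration only.)"""
    num = den = 0.0
    sum_a = sum_expected = sum_var = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        n_ref, n_foc = a + b, c + d
        m_right, m_wrong = a + c, b + d
        num += a * d / n
        den += b * c / n
        sum_a += a
        sum_expected += n_ref * m_right / n
        sum_var += n_ref * n_foc * m_right * m_wrong / (n * n * (n - 1))
    if num == 0 or den == 0 or sum_var == 0:
        raise ValueError("MH statistics are undefined for these tables")
    alpha = num / den
    delta = -2.35 * math.log(alpha)
    # Continuity-corrected statistic; floored at zero for tiny differences.
    chi2 = max(0.0, abs(sum_a - sum_expected) - 0.5) ** 2 / sum_var
    # Upper tail of a chi-square with 1 df equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return MHResult(alpha, delta, chi2, p_value)


# Made-up counts for two score levels (illustrative, not from the study).
result = mantel_haenszel_dif([(10, 5, 6, 9), (15, 5, 10, 10)])
flag_at_020 = result.p_value < 0.20  # liberal level mentioned in the abstract
flag_at_005 = result.p_value < 0.05  # conventional level, for comparison
```

With these made-up counts the item is flagged at the 0.20 level but not at 0.05, which illustrates why a more liberal significance level is one option when samples are small.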

Cited by 13 publications (5 citation statements)
References 36 publications
“…In these cases, the whole power values were coded and were assumed to be associated with the same Type I error rate. For example, Fidalgo, Hashimoto, Bartram, and Muñiz (2007) classified the values corresponding to the statistical power in detecting items as A, B, or C, whereas they only presented one error rate associated with these three values…”
Section: Methods (mentioning)
confidence: 99%
“…Nevertheless, even when the above assumptions are not fulfilled and, for example, the test items follow the three-parameter logistic model (3PLM), the MH procedure has proved to be effective in a wide variety of situations (Allen & Donoghue, 1996; Donoghue et al., 1993; Roussos & Stout, 1996; Shealy & Stout, 1993; Uttaro & Millsap, 1994). Also well established is its high power for detecting uniform DIF and its sensitivity for detecting some types of nonuniform DIF (Hidalgo & López-Pina, 2004; Mazor, Clauser, & Hambleton, 1994; Narayanan & Swaminathan, 1996; Rogers & Swaminathan, 1993), its capacity for detecting DIF with small sample sizes (Camilli & Smith, 1990; Fidalgo, Ferreres, & Muñiz, 2004; Fidalgo, Hashimoto, Bartram, & Muñiz, 2007; Mazor, Clauser, & Hambleton, 1992; Muñiz, Hambleton, & Xing, 2001; Parshall & Miller, 1995), and the advantages of using two-stage or iterative purification procedures for purifying the matching variable (Clauser, Mazor, & Hambleton, 1993; Fidalgo, Mellenbergh, & Muñiz, 2000; Narayanan & Swaminathan, 1994; Rogers & Swaminathan, 1993; Wang & Su, 2004a). It is also fair to point out that substantial latent trait distribution differences of the reference and focal groups yield a highly inflated Type I error (Clauser et al., 1993; Donoghue et al., 1993; Uttaro & Millsap, 1994; Zwick, 1990), especially when the procedure is applied to items that do not fit the 1PLM (Penny & Johnson, 1999).…”
mentioning
confidence: 99%
“…Estimations based on expert reviews are needed to claim bias on any item that is specified to flag DIF as a result of the statistical analysis (Camilli and Shepard, 1994;Zumbo, 1999). Notwithstanding the differences in literature with regards to the sample sizes of DIF studies in polytomous items, Wood (2011) defined a small sample size to be 40 individuals, while Fidalgo, Hashimoto, Bartram, and Muñiz (2007) and Muñiz, Hambleton and Xing (2001) defined a small sample size to be 50 individuals per group.…”
Section: Discussion (mentioning)
confidence: 99%