2008
DOI: 10.1111/j.1745-3984.2008.00071.x

Performance of the Generalized S‐X2 Item Fit Index for Polytomous IRT Models

Abstract: Orlando and Thissen's S‐X2 item fit index has performed better than traditional item fit statistics such as Yen's Q1 and McKinley and Mills' G2 for dichotomous item response theory (IRT) models. This study extends the utility of S‐X2 to polytomous IRT models, including the generalized partial credit model, partial credit model, and rating scale model. The performance of the generalized S‐X2 in assessing item model fit was studied in terms of empirical Type I error rates and power and compared to G2. The res…

Cited by 123 publications (63 citation statements)
References: 33 publications
“…As a summed score-based method, S‐X2 was found to perform better than the traditional Chi-square-type item-fit statistics as well as to yield acceptable power and maintain a nominal Type I error rate for detecting misfit with normally distributed data [31, 32]. However, the performance of S‐X2 in detecting misfit can be influenced by the data distribution and test length [32, 33]. While statistical tests to detect misfit can be sensitive to the aforementioned factors, another direction for assessing item fit is the use of graphical approaches, which is strongly advocated by Hambleton and Han [5].…”
Section: Discussion
mentioning
confidence: 99%
“…For instance, the 3PL estimates three item parameters resulting in df = H − 3. Yen's Q1 and Bock's χ2 are not standard output in any commercial IRT software packages, despite being presented in a range of widely cited sources (see Kang & Chen; Orlando & Thissen; Reise; Sinharay; Stone & Zhang).…”
Section: Traditional Methods For Evaluating Model‐data Fit
mentioning
confidence: 99%
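The degrees-of-freedom rule quoted above (df = H − 3 for the 3PL) can be illustrated with a minimal sketch. It assumes the usual Pearson form of a grouped item-fit statistic for a dichotomous item, in which H groups of examinees are compared on observed versus model-expected proportions correct. The function name `pearson_item_fit` and its arguments are hypothetical, chosen for illustration only, and are not taken from any of the cited papers or from any IRT package.

```python
def pearson_item_fit(group_sizes, observed, expected, n_item_params):
    """Pearson-type item-fit statistic for one dichotomous item.

    group_sizes   -- number of examinees in each of the H groups
    observed      -- observed proportion correct in each group
    expected      -- model-expected proportion correct in each group
    n_item_params -- item parameters estimated (3 for the 3PL)

    Returns (statistic, df) with df = H - n_item_params.
    Hypothetical helper for illustration; the grouping rule itself
    (ability intervals, summed scores, ...) is left to the caller.
    """
    if not (len(group_sizes) == len(observed) == len(expected)):
        raise ValueError("all inputs must have one entry per group")

    stat = 0.0
    for n, o, e in zip(group_sizes, observed, expected):
        if not 0.0 < e < 1.0:
            raise ValueError("expected proportions must lie strictly in (0, 1)")
        # Squared residual scaled by the binomial variance e * (1 - e).
        stat += n * (o - e) ** 2 / (e * (1.0 - e))

    df = len(group_sizes) - n_item_params
    return stat, df


# Usage with made-up numbers: H = 10 groups and a 3PL item.
sizes = [50] * 10
obs = [0.22, 0.30, 0.35, 0.44, 0.52, 0.58, 0.69, 0.75, 0.84, 0.90]
exp = [0.20, 0.28, 0.37, 0.45, 0.53, 0.61, 0.68, 0.76, 0.83, 0.91]
statistic, df = pearson_item_fit(sizes, obs, exp, n_item_params=3)
```

The statistic would then be referred to a chi-square distribution with `df` degrees of freedom. Computing the p-value is omitted here to keep the sketch free of external dependencies.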
“…IRFs that are a good fit to the data can yield appropriate inferences and predictions. IRFs that do not demonstrate good fit to the data run the risk of several undesirable outcomes, including biased ability and item parameter estimates (Wainer & Thissen; Yen) that jeopardize the appropriate application of IRT models in such areas as test development, equating, and computer adaptive testing (Kang & Chen). The consideration of model‐data fit is an important step in test development (see Standard 3.9 of the Standards for Educational and Psychological Testing; AERA/APA/NCME), with misfitting items often being discarded from the potential item pool (Sinharay; Wilson).…”
mentioning
confidence: 99%
“…Information content of items were calculated using Fisher information which is formulated as minus the expectation of the second derivative of the log-likelihood of the model [28]. To evaluate the item fit, the generalized Orlando and Thissen's S-X2 index for polytomous data was used [35], comparing the observed and expected response frequencies under the estimated MIRT model. Eventually items with S-X2 p-value < 0.001 were considered poorly fitted [36].…”
Section: Discussion
mentioning
confidence: 99%
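The definition of Fisher information quoted above (minus the expectation of the second derivative of the log-likelihood) can be checked numerically in a simple case. The sketch below assumes a single dichotomous item under a unidimensional 2PL model, which is a deliberate simplification of the polytomous MIRT setting in the quoted study. All function names are hypothetical.

```python
import math


def p_2pl(theta, a, b):
    """Probability of a correct response under the 2PL model."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def log_lik(theta, x, a, b):
    """Log-likelihood of a single response x (0 or 1) at ability theta."""
    p = p_2pl(theta, a, b)
    return x * math.log(p) + (1 - x) * math.log(1.0 - p)


def fisher_info_numeric(theta, a, b, h=1e-4):
    """Minus the expected second derivative of the log-likelihood in theta.

    The second derivative is approximated by a central finite difference,
    and the expectation is taken over the two possible responses.
    """
    p = p_2pl(theta, a, b)
    info = 0.0
    for x, prob in ((1, p), (0, 1.0 - p)):
        d2 = (
            log_lik(theta + h, x, a, b)
            - 2.0 * log_lik(theta, x, a, b)
            + log_lik(theta - h, x, a, b)
        ) / (h * h)
        info -= prob * d2
    return info


def fisher_info_closed(theta, a, b):
    """Closed-form 2PL item information: a^2 * P * (1 - P)."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)


# Usage: the two values should agree up to finite-difference error.
numeric = fisher_info_numeric(0.3, a=1.5, b=-0.2)
closed = fisher_info_closed(0.3, a=1.5, b=-0.2)
```

Comparing the numerical value with the closed form is only a sanity check on the definition. Information functions for polytomous or multidimensional models have a different closed form and are not covered by this sketch.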