2016
DOI: 10.1007/s11136-016-1467-3

Impact of IRT item misfit on score estimates and severity classifications: an examination of PROMIS depression and pain interference item banks

Abstract: Purpose In patient-reported outcome research that utilizes item response theory (IRT), using statistical significance tests to detect misfit is usually the focus of IRT model-data fit evaluations. However, such evaluations rarely address the impact/consequence of using misfitting items on the intended clinical applications. This study was designed to evaluate the impact of IRT item misfit on score estimates and severity classifications and to demonstrate a recommended process of model-fit evaluation. Methods…
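The consequence-focused check described in the abstract — asking whether a flagged item actually changes respondents' score estimates — can be illustrated with a minimal sketch. This is not the paper's actual analysis; it uses simulated data under a two-parameter logistic (2PL) model, and all item parameters and the choice of which item to drop are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2PL item bank: discriminations a, difficulties b.
n_items, n_persons = 10, 500
a = rng.uniform(0.8, 2.0, n_items)
b = rng.normal(0.0, 1.0, n_items)
theta_true = rng.normal(0.0, 1.0, n_persons)

# Simulate dichotomous responses under the 2PL model.
p_true = 1.0 / (1.0 + np.exp(-a * (theta_true[:, None] - b)))
x = (rng.random((n_persons, n_items)) < p_true).astype(float)

def eap_scores(resp, a, b, grid=np.linspace(-4, 4, 81)):
    """EAP theta estimates under a 2PL model with a N(0, 1) prior."""
    # Response probabilities at each quadrature point: shape (G, I).
    p = 1.0 / (1.0 + np.exp(-a * (grid[:, None] - b)))
    # Log-likelihood of each person's pattern at each point: shape (N, G).
    loglik = resp @ np.log(p).T + (1.0 - resp) @ np.log(1.0 - p).T
    post = np.exp(loglik) * np.exp(-0.5 * grid**2)  # unnormalised posterior
    return (post * grid).sum(axis=1) / post.sum(axis=1)

# Score everyone twice: with the full bank, and with item 0 dropped
# (standing in for an item flagged as misfitting).
theta_all = eap_scores(x, a, b)
keep = np.arange(n_items) != 0
theta_drop = eap_scores(x[:, keep], a[keep], b[keep])

r = np.corrcoef(theta_all, theta_drop)[0, 1]
print(f"correlation of score estimates with vs. without the item: {r:.3f}")
```

If the two sets of score estimates are nearly identical (correlation close to 1, and no respondents crossing severity cut points), the misfit can be judged practically inconsequential, which is the logic the citing studies below apply.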

Cited by 12 publications (13 citation statements)
References 28 publications
“…We further examined the consequence of item misfit on the item and person parameter estimates and found that either including or excluding the nine items yielded nearly identical results. Therefore, as we considered the consequence minor and the misfit tolerable [50], we included all items in the outcome score linking.…”
Section: Results (mentioning; confidence: 99%)
“…Such a practice deserves more attention, and it is strongly encouraged in studies on linking PRO measures to ensure the validity of the inferences drawn from the score concordances. Finally, instead of relying solely on chi-square-like IRT fit statistics, which can be sensitive to sample size, we evaluated IRT item misfit by focusing on the consequences of using misfitting items and the item statistics associated with them, a strategy strongly recommended by Hambleton and Han [65] and Zhao [50]. We hope future studies will adopt a similarly rigorous approach to these methodological issues, to promote the quality of PRO research and to ensure the appropriate application of IRT models.…”
Section: Discussion (mentioning; confidence: 99%)
“…The authors reported that violating this assumption had little effect on the calculation of these estimates, but the presence of multidimensionality in the data affected the precision of the estimates. In a clinical setting, Zhao (2017) evaluated the impact of item-level misfit on estimates of the severity of respondents’ depression and estimates of the intensity of respondents’ pain levels, as well as respondents’ classifications within clinical categories derived from these estimates. Zhao observed that item misfit did not have substantial practical consequences that affected estimates of respondents’ locations on the latent variable and classification within clinical categories.…”
Section: Evaluating the Practical Consequences of the Violation of It… (mentioning; confidence: 99%)
“…Sinharay and Haberman (2014) studied practical significance of model misfit with various empirical data sets and concluded that the misfit was not always practically significant though evidence of misfit for a substantial number of items was demonstrated. Zhao (2017) investigated the practical impact of item misfit with Patient-Reported Outcome Measurement Information System (PROMIS) depression and pain interference item banks, and suggested that item misfit had a negligible impact on score estimates and severity classifications with the studied sample. Meijer and Tendeiro (2015) analyzed two empirical data sets and examined the effect of removing misfitting items and misfitting item score patterns on the rank order of test takers according to their proficiency level score, and found that the impact of removing misfitting items and item score patterns varied depending on the IRT model applied.…”
Section: Introduction (mentioning; confidence: 99%)