2019
DOI: 10.1007/s11336-018-9639-4
Optimal Scores: An Alternative to Parametric Item Response Theory and Sum Scores

Abstract: The aim of this paper is to discuss nonparametric item response theory scores in terms of optimal scores as an alternative to parametric item response theory scores and sum scores. Optimal scores take advantage of the interaction between performance and item impact that is evident in most testing data. The theoretical arguments in favor of optimal scoring are supplemented with the results from simulation experiments, and the analysis of test data suggests that sum-scored tests would need to be longer than an o…
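The abstract's central contrast, that sum scores weight every item equally while scores exploiting "item impact" weight items differently, can be illustrated with a minimal sketch. This is not the paper's nonparametric optimal-scoring method; it uses a hypothetical parametric 2PL model with made-up item parameters purely to show how two response patterns with identical sum scores can receive different ability estimates once item impact is taken into account.

```python
import numpy as np

# Illustrative sketch only: contrasts a plain sum score with a
# likelihood-based score in which items carry different weights.
# The 2PL model and all parameter values below are hypothetical
# stand-ins, not the paper's nonparametric optimal-scoring method.

def sum_score(responses):
    """Unweighted total: every item counts equally."""
    return int(np.sum(responses))

def twopl_prob(theta, a, b):
    """2PL probability of a correct response at ability theta."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def ml_score(responses, a, b, grid=np.linspace(-4, 4, 801)):
    """Maximum-likelihood ability estimate on a grid: more
    discriminating items (larger a) exert more influence."""
    p = twopl_prob(grid[:, None], a, b)                # (grid, items)
    loglik = responses * np.log(p) + (1 - responses) * np.log(1 - p)
    return grid[np.argmax(loglik.sum(axis=1))]

a = np.array([0.5, 1.0, 2.0])   # hypothetical discriminations
b = np.array([-1.0, 0.0, 1.0])  # hypothetical difficulties

# Two response patterns with the same sum score (2 of 3 correct)
# receive different ability estimates once item impact is weighted.
r1 = np.array([1, 1, 0])
r2 = np.array([0, 1, 1])
print(sum_score(r1), sum_score(r2))            # equal sum scores
print(ml_score(r1, a, b), ml_score(r2, a, b))  # different estimates
```

Pattern r2 succeeds on the harder, more discriminating items, so its likelihood-based estimate is higher even though both examinees answered two of three items correctly.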

Cited by 15 publications (10 citation statements)
References 13 publications
“…The lower dimensional curves defined by models such as the two-parameter graded response and partial credit models have the interpretational advantage of having the parameters a and b that directly reflect what we call sensitivity and location, but the disadvantage of being unable to capture more complex variation such as we see in Figures 8 and 9. We believe, however, that these and other simple IRT models would also deliver significant improvement in intensity estimation accuracy over that provided by the sum score, and have already reported on this in the multiple choice testing context in [15].…”
Section: Discussion
confidence: 81%
“…Further information on splines and curve estimation with splines can be found in [13,14]. A comparison of results using two-parameter logistic curves and spline curves for correct responses on a large-scale multiple choice test is given in [15]. The bottom panel of Figure 8 displays the surprisal curves in 5-bit units for a heterogeneous graded response model corresponding to the probability curves in the top panels.…”
Section: The Multinomial Surprisal Vector
confidence: 99%
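The "5-bit units" mentioned in the citation above refer to surprisal, a log transform of choice probability. A minimal sketch, assuming the M-bit convention used in Ramsay's surprisal framework (a base-M logarithm for an item with M response options, so 5-bit units use log base 5); the cited paper's exact conventions may differ.

```python
import numpy as np

# Hedged sketch: converting option-choice probabilities into
# surprisal values. The M-bit convention assumed here is that an
# item with M = 5 response categories uses the base-5 logarithm,
# so a chance-level probability of 1/5 equals exactly one 5-bit.

def surprisal(prob, n_options=5):
    """Surprisal S = -log_M(P), expressed in M-bit units."""
    prob = np.asarray(prob, dtype=float)
    return -np.log(prob) / np.log(n_options)

print(surprisal(1.0))   # a certain choice carries zero surprisal
print(surprisal(0.2))   # chance level (1/5) is exactly one 5-bit
```

Unlike probability, surprisal is unbounded above as P falls toward zero, which is what lets surprisal curves register fine variation among low-probability options.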
“…proposed by Arenson and Karabatsos (2018), which uses no specific parametric function as base distribution, was able to perform better than the parametric 2PL model, especially when symmetric priors, as the ones used in the present study, are used. Finally, these models can all be compared to sum scores, which can be considered as lower bound benchmarks for the performance of the models (Wiberg, Ramsay, & Li, 2018).…”
Section: Procedures' Performance and Hypothesis
confidence: 99%
“…In recent years, flexible item response models have increasingly been used in confirmatory contexts. For example, flexible item response models have recently been applied to computerized adaptive testing [7,8], the creation of item banks for measuring health outcomes [9], and the development of optimal scoring procedures [10].…”
Section: Introduction
confidence: 99%