2010
DOI: 10.1111/j.1365-2923.2009.03425.x
A primer on classical test theory and item response theory for assessments in medical education

Abstract: CONTEXT A test score is a number which purportedly reflects a candidate's proficiency in some clearly defined knowledge or skill domain. A test theory model is necessary to help us better understand the relationship that exists between the observed (or actual) score on an examination and the underlying proficiency in the domain, which is generally unobserved. Common test theory models include classical test theory (CTT) and item response theory (IRT). The widespread use of IRT models over the past several deca…

Cited by 221 publications (217 citation statements)
References 17 publications
“…The adequacy of the pool in terms of difficulty and discrimination was evaluated on the basis of the P -value (a measure of item facility), where the recommended range is 30–70 (De Champlain, 2010; Oermann and Gaberson, 2013) and on the basis of the r-PB, where the following cut-offs were used: >0.40 (very good), 0.30–0.39 (reasonably good), 0.20–0.29 (marginally good, in need of improvement), and ≤0.19 (the item must be rejected or improved by revision) (Matlock-Hetzel, 1997; Taib and Yusoff, 2014). …”
Section: Methods (mentioning)
confidence: 99%
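The screening rule quoted in this statement can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the citing authors' procedure: the function names are hypothetical, items are assumed to be scored 0/1, the total score is assumed to include the item itself, and a value of exactly 0.40 is assumed to count as "very good" (the quoted cut-offs leave that boundary open).

```python
import math
from statistics import mean, pstdev


def item_facility(item_scores):
    """Percentage of candidates who answered a 0/1-scored item correctly."""
    return 100.0 * mean(item_scores)


def facility_in_range(p_value, low=30.0, high=70.0):
    """Check the quoted recommended range of 30-70 for the P-value."""
    return low <= p_value <= high


def point_biserial(item_scores, total_scores):
    """Point-biserial correlation (r-PB) between a 0/1 item and total scores.

    Uses r = (M1 - M0) / s * sqrt(p * q), with s the population standard
    deviation of the totals. Returns NaN when r-PB is undefined.
    """
    p = mean(item_scores)
    sd = pstdev(total_scores)
    if p in (0, 1) or sd == 0:
        return float("nan")
    mean_correct = mean(t for i, t in zip(item_scores, total_scores) if i == 1)
    mean_incorrect = mean(t for i, t in zip(item_scores, total_scores) if i == 0)
    return (mean_correct - mean_incorrect) / sd * math.sqrt(p * (1 - p))


def classify_discrimination(r_pb):
    """Map r-PB onto the quoted cut-offs (0.40 treated as 'very good': assumption)."""
    if math.isnan(r_pb):
        return "undefined"
    if r_pb >= 0.40:
        return "very good"
    if r_pb >= 0.30:
        return "reasonably good"
    if r_pb >= 0.20:
        return "marginally good"
    return "reject or revise"


# Made-up responses for five candidates on one item, with their total scores.
item = [1, 1, 1, 0, 0]
totals = [5, 4, 4, 2, 1]
p_value = item_facility(item)
r_pb = point_biserial(item, totals)
print(facility_in_range(p_value), classify_discrimination(r_pb))
```

The example data are invented for demonstration. With real examinations, many authors compute r-PB against the total score with the item removed, which this sketch does not do.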
“…despite the fact that medical educators have acknowledged its existence (Downing 2003; de Champlain 2010).…”
Section: Practice Points (mentioning)
confidence: 99%
“…De Champlain (De Champlain, 2010) defined that TIC standard depends on the intended use of test scores. If a test is a selection examination, it is important to measure a broad range of abilities with a similar level of precision or reliability out of fairness to candidates.…”
Section: Discussion (mentioning)
confidence: 99%