Ordered rating scales are widely used in the assessment of personality, attitudes, and other latent variables. For example, in the Revised NEO Personality Inventory (NEO-PI-R; Costa & McCrae, 1992), participants respond on a 5-point Likert-type scale with the options strongly disagree, disagree, neutral, agree, and strongly agree. Another example of an ordered rating scale is the set of response categories never, sometimes, often, and always. The assumption underlying ordered rating scales is that endorsing a higher response category implies a higher trait level.

Models in the framework of item response theory (IRT), such as the Rasch model (Rasch, 1960) for dichotomous items or the partial credit model (PCM; Masters, 1982) for polytomous items, define the probability of a response in a certain category as a function of the respondent's latent trait level and item characteristics. In the Rasch model, only one item parameter is estimated, namely, the difficulty of the item. In modeling responses from ordered rating scales according to the PCM, the "difficulty" of each of the response categories needs to be taken into account. This is done using threshold parameters, each defined as the point on the latent trait continuum where the response probabilities for two adjacent response categories are equal. Thus, a 5-point scale has four threshold parameters. To illustrate, Figure 1A shows category probability curves for an item with a 5-point scale. These curves represent the probability of endorsing each of the five categories conditional on the latent trait level depicted on the x-axis. The four thresholds are marked by vertical lines. Threshold 1, which is the threshold between the categories strongly disagree and disagree, is located at about −2.9 logits. Thus, respondents with trait levels of exactly −2.9 have equal probabilities of endorsing strongly disagree and disagree.
Respondents with trait levels below −2.9 have the highest probability of endorsing strongly disagree, whereas respondents with trait levels between −2.9 and −0.8 (where threshold 2 is located) have the highest probability of endorsing disagree. This definition and interpretation of threshold parameters also holds for extensions of the PCM to two-parameter logistic models such as the generalized PCM (GPCM; Muraki, 1992). In the Rasch model and the PCM, it is assumed that all items have the same item discrimination. Two-parameter logistic models for dichotomous or polytomous items (e.g., GPCM) relax this assumption and estimate a discrimination parameter for each item in addition to the difficulty (threshold) parameters. Although most of the following discussion focuses on the one-parameter PCM, some of the analyses are also reported for the GPCM to show that the results generalize to models with more than one parameter.
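The threshold interpretation described above can be checked numerically. The following is a minimal sketch, not the estimation procedure used in this article: it evaluates the standard (G)PCM category probabilities for a single item and confirms that two adjacent categories are equally likely at the threshold between them. The function name `gpcm_probs` is hypothetical. The first two threshold values (−2.9 and −0.8) are the ones read from Figure 1A; the last two (0.7 and 2.5) are assumed purely for illustration.

```python
import math


def gpcm_probs(theta, thresholds, discrimination=1.0):
    """Category probabilities for one item under the (G)PCM.

    With discrimination=1.0 this reduces to the PCM.
    """
    # Cumulative sums of a * (theta - delta_j); category 0 has an empty sum of 0.
    cumulative = [0.0]
    for delta in thresholds:
        cumulative.append(cumulative[-1] + discrimination * (theta - delta))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


labels = ["strongly disagree", "disagree", "neutral", "agree", "strongly agree"]
# Thresholds 1 and 2 follow the text; thresholds 3 and 4 are assumed values.
thresholds = [-2.9, -0.8, 0.7, 2.5]

# At threshold 1, the two lowest categories are equally probable.
probs_at_t1 = gpcm_probs(-2.9, thresholds)

# Below threshold 1 the modal category is the lowest one;
# between thresholds 1 and 2 it is the second one.
probs_below = gpcm_probs(-3.5, thresholds)
probs_between = gpcm_probs(-2.0, thresholds)
modal_below = labels[probs_below.index(max(probs_below))]
modal_between = labels[probs_between.index(max(probs_between))]

# The same equality holds under the GPCM with an (assumed) discrimination of 1.7.
gpcm_at_t1 = gpcm_probs(-2.9, thresholds, discrimination=1.7)
```

Because the ratio of adjacent category probabilities depends only on the discrimination times the distance between the trait level and the threshold, the equality at the threshold does not depend on the discrimination value. The modal-category reading, by contrast, relies on the thresholds being in increasing order, as they are in this example.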
Abstract
When questionnaire data with an ordered polytomous response format are analyzed in the framework of item response theory using the partial credit model or the generalized partial credit model, revers...