Item response modeling of forced-choice questionnaires

Abstract

Multidimensional forced-choice formats can significantly reduce the impact of numerous response biases typically associated with rating scales. However, if scored with classical methodology, these questionnaires produce ipsative data, which leads to distorted scale relationships and makes comparisons between individuals problematic. This research demonstrates how item response theory (IRT) modeling may be applied to overcome these problems. A multidimensional IRT model based on Thurstone's framework for comparative data is introduced, which is suitable for use with any forced-choice questionnaire composed of items fitting the dominance response model, with any number of measured traits, and any block size (i.e., pairs, triplets, quads, etc.). Thurstonian IRT models are normal ogive models with structured factor loadings, structured uniquenesses, and structured local dependencies. These models can be estimated straightforwardly with the structural equation modeling (SEM) software Mplus. A number of simulation studies are performed to investigate how well latent traits are recovered under various forced-choice designs and to provide guidelines for optimal questionnaire design. An empirical application illustrates how the model may be applied in practice. It is concluded that when the recommended design guidelines are met, scores estimated from forced-choice questionnaires with the proposed methodology reproduce the latent traits well.

Keywords: forced-choice format, forced-choice questionnaires, ipsative data, comparative judgment, multidimensional IRT

The most popular way of presenting questionnaire items is through rating scales (Likert scales), where participants are asked to rate a statement using given categories (for example, ranging from "strongly disagree" to "strongly agree", or from "never" to "always"). It is well known that this format (the single-stimulus format) can lead to various response biases, for instance because participants do not interpret the rating categories in the same way (Friedman & Amoo, 1999). An alternative is the forced-choice format, in which statements are presented in blocks and respondents choose between statements according to the extent these statements describe their preferences or behavior. When there are two statements in a block, respondents are simply asked to select the statement that better describes them. For blocks of three, four, or more statements, respondents may be asked to rank-order the statements, or to select one statement that is "most like me" and one that is "least like me" (i.e., to provide a partial ranking).

Because it is impossible to endorse every item, the forced-choice format eliminates uniform biases such as acquiescence responding (Cheung & Chan, 2002), and can increase operational validity by reducing "halo" effects (Bartram, 2007). However, there are serious problems with the way forced-choice questionnaires have been scored traditionally. Typically, rank orders of items in a block are reversed ...
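To make the model described in the abstract concrete, the following is a minimal sketch of the comparative-judgment equations underlying Thurstonian IRT for a single pair of items {i, k}; the notation (latent utilities t, traits η, loadings λ, uniquenesses ψ²) is generic textbook notation, not quoted from the paper.

```latex
% Latent utilities under the dominance response model:
% each item taps one trait (eta_a for item i, eta_b for item k).
\[
  t_i = \mu_i + \lambda_i \eta_a + \varepsilon_i, \qquad
  t_k = \mu_k + \lambda_k \eta_b + \varepsilon_k, \qquad
  \varepsilon \sim N(0, \psi^2)
\]
% Item i is preferred whenever its utility is at least as large:
\[
  y_{ik} = \begin{cases} 1, & t_i - t_k \ge 0 \\ 0, & \text{otherwise} \end{cases}
\]
% Integrating out the errors yields a normal ogive (probit) response function:
\[
  P(y_{ik} = 1 \mid \boldsymbol{\eta}) =
    \Phi\!\left(\frac{\mu_i - \mu_k + \lambda_i \eta_a - \lambda_k \eta_b}
                     {\sqrt{\psi_i^{2} + \psi_k^{2}}}\right)
\]
```

In blocks of three or more items, the binary outcomes coded from one ranking share error terms, which is where the structured local dependencies mentioned in the abstract arise.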
In multidimensional forced-choice (MFC) questionnaires, items measuring different attributes are presented in blocks, and participants have to rank-order the items within each block (fully or partially). Such comparative formats can reduce the impact of numerous response biases that often affect single-stimulus items (also known as rating or Likert scales). However, if scored with traditional methodology, MFC instruments produce ipsative data, whereby all individuals have a common total test score. Ipsative scoring distorts individual profiles (it is impossible to achieve all high or all low scale scores), construct validity (covariances between scales must sum to zero), criterion-related validity (validity coefficients must sum to zero), and reliability estimates. We argue that these problems are caused by inadequate scoring of forced-choice items, and advocate the use of item response theory (IRT) models based on an appropriate response process for comparative data, such as Thurstone's law of comparative judgment. We show that by applying Thurstonian IRT modeling (Brown & Maydeu-Olivares, 2011), even existing forced-choice questionnaires with challenging features can be scored adequately, and that the IRT-estimated scores are free from the problems of ipsative data.

Assessments of personality, social attitudes, interests, motivation, psychopathology, and well-being rely largely on respondent-reported measures. Most such measures employ the so-called single-stimulus format, where respondents evaluate one question (or item) at a time, often in relation to a rating scale (i.e., Likert-type items). Because respondents rate each item separately from other items, they make absolute judgments about the extent to which the item describes their personality, attitudes, etc. Simple to answer and score, and therefore popular with test takers and test users, the single-stimulus format nevertheless makes several assumptions about respondents' rating behavior that are often unrealistic. For instance, the use of rating scales relies on the assumption that respondents interpret category labels in the same way. This assumption is rarely tested in practice, but the available research suggests that the interpretation and meaning of response categories vary from one respondent to another (Friedman & Amoo, 1999). Furthermore, individual response styles vary (Van Herk, Poortinga, & Verhallen, 2004), so that some respondents avoid extreme categories (central tendency responding), whereas others prefer them (extreme responding). Sometimes respondents tend to agree with both positive and negative statements as presented (acquiescence bias).

Another common problem is getting respondents to differentiate between the ratings they give to single-stimulus items. When rating another person's attributes or behavior (as in 360-degree feedback), respondents commonly give either high or low ratings on all behaviors (halo/horn effect), depending on whether they judge the person to score high or low on a single important dimension. Typically, respon...
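As a quick illustration of the ipsative-data problem described above, the following Python sketch simulates rank-order responses to triplet blocks under a simple Thurstonian dominance process and scores them classically (reversed within-block ranks). All parameter values are invented for illustration. Trait profiles differ across simulated respondents, yet every total score is identical, which is exactly the property that distorts profile, construct, and criterion-related comparisons.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

n_persons, n_traits, n_blocks = 5, 3, 4   # triplet blocks: one item per trait

# Invented generating parameters (illustrative only)
lam = rng.uniform(0.6, 1.0, size=(n_blocks, n_traits))  # factor loadings
mu  = rng.normal(0.0, 0.5, size=(n_blocks, n_traits))   # item intercepts
psi = rng.uniform(0.4, 0.8, size=(n_blocks, n_traits))  # uniqueness SDs
eta = rng.normal(size=(n_persons, n_traits))            # latent traits

scores = np.zeros((n_persons, n_traits))
for p in range(n_persons):
    for b in range(n_blocks):
        # Latent utilities under the dominance model: t = mu + lambda * eta + eps
        t = mu[b] + lam[b] * eta[p] + rng.normal(0.0, psi[b])
        # Classical scoring reverses the within-block ranks:
        # most preferred item earns 3 points, middle 2, least preferred 1
        scores[p] += np.argsort(np.argsort(t)) + 1

print(scores)             # scale profiles differ from person to person...
print(scores.sum(axis=1)) # ...but every total equals n_blocks * 6 = 24 (ipsative)
```

The constant row sums mean scale scores carry only within-person (relative) information, so covariances between the scales are forced to be negative on average, as the abstract notes.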
The present research addressed gaps in our current understanding of the validity and quality of measurement provided by patient-reported experience measures. We established the psychometric properties of a freely available Experience of Service Questionnaire (ESQ), based on responses from 7,067 families of patients across 41 UK providers of child and adolescent mental health services, using two-level latent trait modeling. Responses to the ESQ were subject to strong "halo" effects, which were thought to represent the overall positive or negative affect towards one's treatment. Two strongly related constructs measured by the ESQ were interpreted as specific aspects of global satisfaction, namely satisfaction with care and satisfaction with environment. The Care construct was sensitive to differences between less satisfied patients, facilitating individual- and service-level problem evaluation. The effects of nesting within service providers were strong, with parental reports being the most reliable source of data for between-provider comparisons. We provide a scoring protocol for converting the hand-scored ESQ to model-based, population-referenced scores with accompanying standard errors, which can be used for benchmarking services as well as for individual evaluations.