Objective: Patient-reported outcomes (PROs) are essential when evaluating many new treatments in health care, yet current measures have been limited by a lack of precision, standardization, and comparability of scores across studies and diseases. The Patient-Reported Outcomes Measurement Information System (PROMIS™) provides item banks that offer the potential for measurement of commonly studied PROs that is efficient (minimizes item number without compromising reliability), flexible (enables optional use of interchangeable items), and precise (has minimal error in estimate). We report results from the first large-scale testing of PROMIS items. Study Design and Setting: Fourteen item pools were tested in the U.S. general population and clinical groups using an online panel and clinic recruitment. A scale-setting sub-sample was created reflecting demographics proportional to the 2000 U.S. census. Results: Using item response theory (graded response model), 11 item banks were calibrated on a sample of 21,133, measuring components of self-reported physical, mental, and social health, along with a 10-item global health scale. Short forms from each bank were developed and compared with the overall bank as well as with other well-validated and widely accepted (“legacy”) measures. All item banks demonstrated good reliability across the majority of the score distributions. Construct validity was supported by moderate to strong correlations with legacy measures. Conclusion: The evidence indicates that PROMIS item banks and their short forms are reliable and precise measures of generic symptoms and functional reports, comparable to legacy instruments. Further testing will continue to validate PROMIS items and banks in diverse clinical populations.
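As an illustration of the graded response model named in the abstract above, the following minimal sketch computes response-category probabilities for a single polytomous item. The function name, the discrimination `a`, and the threshold values are hypothetical and are not PROMIS calibration parameters; the sketch assumes the standard logistic form of Samejima's model with ordered thresholds.

```python
import math


def grm_category_probs(theta, a, thresholds):
    """Category probabilities for one item under a graded response model.

    theta      -- latent trait level (hypothetical value for illustration)
    a          -- item discrimination (hypothetical)
    thresholds -- ordered category thresholds b_1 < b_2 < ... (hypothetical)
    """
    # Cumulative probabilities P(X >= k), bracketed by 1 (lowest) and 0 (beyond highest)
    cumulative = [1.0]
    cumulative += [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    cumulative.append(0.0)
    # Each category probability is the difference of adjacent cumulative probabilities
    return [cumulative[k] - cumulative[k + 1] for k in range(len(cumulative) - 1)]


# Illustrative 5-category item evaluated at the population mean (theta = 0)
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-1.0, 0.0, 1.0, 2.0])
```

With four thresholds the item has five response categories, and the probabilities sum to 1 at any trait level.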
Key analytic issues are summarized, and recommendations are provided for future evaluations of item banks in HRQOL assessment.
This paper describes the psychometric properties of the PROMIS Pain Interference (PROMIS-PI) bank. An initial candidate item pool (n = 644) was developed and evaluated based on review of existing instruments, interviews with patients, and consultation with pain experts. From this pool, a candidate item bank of 56 items was selected, and responses to the items were collected from large community and clinical samples. A total of 14,848 participants responded to all or a subset of candidate items. The responses were calibrated using an item response theory (IRT) model. A final 41-item bank was evaluated with respect to IRT assumptions, model fit, differential item functioning (DIF), precision, and construct and concurrent validity. Items of the revised bank had good fit to the IRT model (CFI and NNFI/TLI ranged from 0.974 to 0.997), and the data were strongly unidimensional (e.g., ratio of first to second eigenvalue = 35). Nine items exhibited statistically significant DIF. However, adjusting for DIF had little practical impact on score estimates, and the items were retained without modifying scoring. Scores provided substantial information across levels of pain; for scores in the T-score range 50-80, the reliability was equivalent to 0.96 to 0.99. Patterns of correlations with other health outcomes supported the construct validity of the item bank. The scores discriminated among persons with different numbers of chronic conditions, disabling conditions, levels of self-reported health, and pain intensity (p < 0.0001). The results indicated that the PROMIS-PI items constitute a psychometrically sound bank. Computerized adaptive testing and short forms are available.
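To make the "reliability equivalent" figures above more concrete, here is a minimal sketch of the common conversion between a standard error on the T-score metric and a reliability coefficient. It assumes a reference-population standard deviation of 10 T-score points and the relation reliability = 1 - (SE / SD)^2; the abstract does not state how the authors computed their values, so this is an illustration rather than their method. The function names are hypothetical.

```python
import math

T_SCORE_SD = 10.0  # assumed SD of the T-score metric in the reference population


def reliability_from_se(se_t, sd=T_SCORE_SD):
    """Reliability implied by a standard error expressed in T-score units."""
    return 1.0 - (se_t / sd) ** 2


def se_from_reliability(reliability, sd=T_SCORE_SD):
    """Standard error in T-score units implied by a reliability coefficient."""
    return sd * math.sqrt(1.0 - reliability)


# Standard errors implied by the reliability range quoted in the abstract
se_low_end = se_from_reliability(0.96)
se_high_end = se_from_reliability(0.99)
```

Under these assumptions, reliabilities of 0.96 and 0.99 correspond to standard errors of roughly 2 and 1 T-score points, respectively.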
Differences in scores on a quality-of-life measure that correspond to differences in function or clinical course are called clinically important differences (CIDs). Anchor-based and distribution-based methods were used to provide ranges of CIDs for five targeted scale scores of the Functional Assessment of Cancer Therapy-Anemia (FACT-An) questionnaire. Three samples of cancer patients were used: Sample 1 included 50 patients participating in a validation study of the FACT-An; Sample 2 included 131 patients participating in a longitudinal study of chemotherapy-induced fatigue; Sample 3 included 2,402 patients enrolled in a community-based clinical trial evaluating the effectiveness and safety of a treatment for anemia. Three clinical indicators (hemoglobin level, performance status, and response to treatment) were used to determine anchor-based differences. One-half of the standard deviation and 1 standard error of measurement were used as distribution-based criteria. Analyses supported the following whole-number estimates of a minimal CID for these five targeted scores: Fatigue Scale = 3.0; FACT-G total score = 4.0; FACT-An total score = 7.0; Trial Outcome Index-Fatigue = 5.0; and Trial Outcome Index-Anemia = 6.0. These estimates provide a basis for sample size estimation when planning a clinical trial or other longitudinal study whose purpose is to ensure detection of meaningful change over time. They can also be used in conjunction with more traditional clinical markers to assist investigators in determining treatment efficacy.
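The two distribution-based criteria mentioned above can be sketched as follows. The standard error of measurement is computed with the usual formula SEM = SD * sqrt(1 - reliability). The input values (`scale_sd`, `scale_reliability`) are hypothetical placeholders chosen for illustration, not statistics from the FACT-An samples.

```python
import math


def half_sd(sd):
    """Distribution-based criterion: one-half of the score standard deviation."""
    return 0.5 * sd


def standard_error_of_measurement(sd, reliability):
    """Distribution-based criterion: 1 SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


# Hypothetical scale statistics, for illustration only
scale_sd = 12.0
scale_reliability = 0.91

cid_half_sd = half_sd(scale_sd)
cid_sem = standard_error_of_measurement(scale_sd, scale_reliability)
```

The two criteria generally give different values, which is one reason a range of CID estimates is reported and then compared against anchor-based differences.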