Three distinct methods of assessing measurement equivalence of ordinal items, namely, confirmatory factor analysis, differential item functioning using item response theory, and latent class factor analysis, make different modeling assumptions and adopt different procedures. Simulation data are used to compare the performance of these three approaches in detecting the sources of measurement inequivalence. For this purpose, the authors simulated Likert-type data using two nonlinear models, one with categorical and one with continuous latent variables. Inequivalence was introduced in the slope parameters (loadings) as well as in the item intercept parameters in a form resembling agreement and extreme response styles. Results indicate that the item response theory and latent class factor models can detect and locate inequivalence in the intercept and slope parameters fairly accurately, both at the scale and the item levels. Confirmatory factor analysis performs well when inequivalence is located in the slope parameters but incorrectly indicates inequivalence in the slope parameters when it is located in the intercept parameters. Influences of sample size, number of inequivalent items in a scale, and model fit criteria on the performance of the three methods are also analyzed.
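The data-generating setup described above can be illustrated with a small simulation. The sketch below is only an illustration under assumed values; the loadings, intercepts, and thresholds are not those used in the study. It draws Likert-type responses from a one-factor ordinal model and introduces inequivalence in a focal group by shifting one item intercept, roughly mimicking a uniform (agreement-style) source of inequivalence.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_likert(n, loadings, intercepts, thresholds):
    """Simulate ordinal (Likert-type) responses from a one-factor model.

    Each item's latent response is intercept + loading * factor + noise,
    and is then cut at the shared thresholds into ordered categories.
    """
    n_items = len(loadings)
    factor = rng.normal(size=n)                 # latent trait per respondent
    noise = rng.normal(size=(n, n_items))
    latent = intercepts + np.outer(factor, loadings) + noise
    return np.digitize(latent, thresholds)      # category scores 0..len(thresholds)

loadings = np.array([0.8, 0.7, 0.9, 0.6])       # illustrative slope parameters
intercepts = np.zeros(4)
thresholds = np.array([-1.5, -0.5, 0.5, 1.5])   # five response categories

# Reference group: equivalent measurement model
group_ref = simulate_likert(1000, loadings, intercepts, thresholds)

# Focal group: intercept shift on the first item (uniform inequivalence);
# shifting a loading instead would mimic the slope (non-uniform) case.
intercepts_focal = intercepts.copy()
intercepts_focal[0] = 0.7
group_focal = simulate_likert(1000, loadings, intercepts_focal, thresholds)
```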
In cross-cultural comparative studies, it is essential to establish equivalent measurement of the relevant constructs across cultures. If this equivalence is not confirmed, it is difficult, if not impossible, to make meaningful comparisons of results across countries. This work presents the concept of measurement equivalence, its relationship with other related concepts, the different levels of equivalence, and the causes of inequivalence in cross-cultural research. It also reviews three main approaches to the analysis of measurement equivalence - multigroup confirmatory factor analysis, differential item functioning, and multigroup latent class analysis - with special emphasis on their similarities and differences, as well as their comparative advantages.
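For reference, the equivalence levels mentioned here are commonly defined through constraints on a multigroup factor model; the notation below is the standard formulation, not one taken from this text.

```latex
% Measurement model for the item vector of respondent i in group g
x_{ig} = \tau_g + \Lambda_g \xi_{ig} + \delta_{ig}

% Configural invariance: the same pattern of fixed and free loadings in every group
% Metric (weak) invariance:   \Lambda_1 = \Lambda_2 = \dots = \Lambda_G
% Scalar (strong) invariance: metric invariance plus \tau_1 = \tau_2 = \dots = \tau_G
```

Only under scalar invariance can latent means be meaningfully compared across groups; metric invariance suffices for comparing relationships among constructs.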
The Program for International Student Assessment (PISA) is a large-scale cross-national study that measures academic competencies of 15-year-old students in mathematics, reading, and science from more than 50 countries/economies around the world. PISA results are usually aggregated and presented in so-called "league tables," in which countries are compared and ranked on each of the three scales. However, to compare results obtained from different groups/countries, one must first be sure that the tests measure the same competencies in all cultures. In this paper, this is tested by examining the level of measurement equivalence in the 2009 PISA data set using an item response theory (IRT) approach and analyzing differential item functioning (DIF). Measurement inequivalence was found in the form of uniform DIF. Inequivalence occurred in a majority of test questions in all three scales examined and is, on average, of moderate size. It varies considerably both across items and across countries. When this uniform DIF is accounted for in the inequivalent model, the resulting country scores change considerably on the "Mathematics," "Science," and especially the "Reading" scale. These changes tend to occur simultaneously and in the same direction in groups of regional countries. The most affected appear to be Southeast Asian countries/territories, whose scores, although among the highest in the initial, homogeneous model, increase further when inequivalence in the scales is accounted for.
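The uniform DIF reported above can be screened for with several procedures. The sketch below uses the logistic-regression likelihood-ratio test (Swaminathan-Rogers style) rather than the IRT approach applied in the paper, and runs on toy data; it is an illustration of the general idea, not a reproduction of the study's analysis.

```python
import numpy as np
import statsmodels.api as sm
from scipy.stats import chi2

def uniform_dif_lr_test(item, total, group):
    """Logistic-regression screen for uniform DIF on one dichotomous item.

    Compares a model with only the matching variable (total score) to a model
    that also includes the group indicator; a significant likelihood-ratio
    statistic suggests uniform DIF.
    """
    base = sm.add_constant(np.column_stack([total]))
    full = sm.add_constant(np.column_stack([total, group]))
    fit_base = sm.Logit(item, base).fit(disp=0)
    fit_full = sm.Logit(item, full).fit(disp=0)
    lr_stat = 2 * (fit_full.llf - fit_base.llf)   # likelihood-ratio statistic, 1 df
    return lr_stat, chi2.sf(lr_stat, df=1)

# Toy data: 0/1 item responses and a two-country grouping variable
rng = np.random.default_rng(0)
n, k = 500, 10
y = rng.integers(0, 2, size=(n, k))
country = rng.integers(0, 2, size=n)
total_score = y.sum(axis=1)

for j in range(k):
    stat, p = uniform_dif_lr_test(y[:, j], total_score, country)
    print(f"item {j}: LR = {stat:.2f}, p = {p:.3f}")
```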