The purpose of this study was to evaluate the internal consistency reliability of the General Teacher Test under both clustered and non-clustered data assumptions using commercial software (Mplus). Participants were 2,000 testees selected by random sampling from a larger pool of more than 65,000 examinees. The measure involved four factors, namely (a) planning for learning, (b) promoting learning, (c) supporting learning, and (d) professional responsibilities, and was hypothesized to form a unidimensional instrument assessing generalized skills and competencies. Intraclass correlation coefficients and variance ratio statistics suggested the need to incorporate a clustering variable (i.e., university) when evaluating the factor structure of the measure. Results indicated that single-level reliability estimation significantly overestimated the reliability observed across persons and underestimated the reliability at the clustering (university) level. Single-level reliability also, at times, fell below the lowest acceptable levels, leading to a conclusion of unreliability, whereas multilevel reliability was low at the between-person level but excellent at the between-university level. It was concluded that ignoring nesting is associated with distorted and erroneous estimates of the internal consistency reliability of an ability measure, and that the use of multilevel confirmatory factor analysis (MCFA) is imperative to account for dependencies across levels of analysis.
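The abstract does not reproduce the estimation syntax, so as a minimal sketch of the clustering diagnostic it describes, the following Python code computes ICC(1) from a one-way ANOVA decomposition and contrasts a single-level Cronbach's alpha with an alpha computed on university means. The simulated data, cluster counts, and variable names are hypothetical illustrations, not the study's data, and the between-level alpha shown is a crude proxy for the model-based multilevel reliability the study estimated in Mplus.

```python
import numpy as np
import pandas as pd

def icc1(scores: pd.Series, clusters: pd.Series) -> float:
    """ICC(1) via one-way ANOVA: (MSB - MSW) / (MSB + (k - 1) * MSW),
    with k the average cluster size (an approximation when unbalanced)."""
    groups = scores.groupby(clusters)
    k = groups.size().mean()
    grand = scores.mean()
    ss_between = (groups.size() * (groups.mean() - grand) ** 2).sum()
    ss_within = ((scores - groups.transform("mean")) ** 2).sum()
    ms_between = ss_between / (groups.ngroups - 1)
    ms_within = ss_within / (len(scores) - groups.ngroups)
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Alpha = k/(k-1) * (1 - sum of item variances / total-score variance)."""
    k = items.shape[1]
    return k / (k - 1) * (1 - items.var(ddof=1).sum() / items.sum(axis=1).var(ddof=1))

# Hypothetical data: 2,000 examinees nested in 20 universities, 12 items,
# with a nonzero between-university variance component built in.
rng = np.random.default_rng(0)
uni = rng.integers(0, 20, size=2000)
uni_effect = rng.normal(0, 0.5, size=20)[uni]
items = pd.DataFrame(uni_effect[:, None] + rng.normal(0, 1, size=(2000, 12)))

print("ICC(1) of total score:", round(icc1(items.sum(axis=1), pd.Series(uni)), 3))
print("Single-level alpha:   ", round(cronbach_alpha(items), 3))
print("Alpha on cluster means:", round(cronbach_alpha(items.groupby(uni).mean()), 3))
```

A nontrivial ICC(1) is the usual signal that single-level reliability estimates will be distorted, which is the condition under which the study recommends moving to MCFA.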
This article provides an empirical illustration of the utility of the bifactor method for unidimensionality assessment when other methods disagree. Specifically, we used two popular methods for unidimensionality assessment, (a) evaluating the model fit of a one-factor model using Mplus and (b) DIMTEST, to show that different unidimensionality methods may lead to different results, and argued that in such cases the bifactor method can be particularly useful. These procedures were applied to the English Placement Test (EPT), a high-stakes English proficiency test in Saudi Arabia, to determine whether the EPT is unidimensional so that a unidimensional item response theory (IRT) model can be used for calibration and scoring. We concluded that, despite the inconsistency between the one-factor model approach and DIMTEST, the bifactor method indicates that, for practical purposes, the unidimensionality assumption holds for the EPT.
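For readers unfamiliar with how a bifactor solution is used to judge whether a test is "unidimensional enough," the sketch below computes two standard indices from a bifactor loading matrix: explained common variance (ECV) and omega hierarchical (omega_H). The loadings and thresholds are made-up placeholders and common rules of thumb, not the EPT estimates or the article's criteria.

```python
import numpy as np

def bifactor_indices(general, group, labels):
    """ECV and omega-hierarchical from standardized bifactor loadings.

    general : (n_items,) loadings on the general factor
    group   : (n_items,) loadings on each item's specific (group) factor
    labels  : (n_items,) index of the group factor each item loads on
    Assumes standardized items, so uniqueness = 1 - general^2 - group^2.
    """
    general, group, labels = map(np.asarray, (general, group, labels))
    # ECV: share of the common variance carried by the general factor.
    ecv = (general ** 2).sum() / ((general ** 2).sum() + (group ** 2).sum())
    # omega_H: proportion of total-score variance due to the general factor alone.
    uniq = 1 - general ** 2 - group ** 2
    total_var = (general.sum() ** 2
                 + sum(group[labels == f].sum() ** 2 for f in np.unique(labels))
                 + uniq.sum())
    omega_h = general.sum() ** 2 / total_var
    return ecv, omega_h

# Hypothetical 8-item test with two group factors (items 0-3 and items 4-7).
general = [.60, .65, .55, .70, .62, .58, .66, .61]
group   = [.30, .25, .35, .20, .28, .32, .22, .27]
labels  = [0, 0, 0, 0, 1, 1, 1, 1]

ecv, omega_h = bifactor_indices(general, group, labels)
print(f"ECV = {ecv:.3f}, omega_H = {omega_h:.3f}")
# Common heuristics: ECV above roughly .70 and omega_H above roughly .80
# suggest the data can be treated as unidimensional for IRT calibration.
```

High values on these indices are the kind of evidence that lets a bifactor analysis break the tie when a one-factor fit evaluation and DIMTEST disagree.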
The purpose of the present study was to extend the model of measurement invariance by simultaneously estimating invariance across multiple populations in the dichotomous-instrument case using multi-group confirmatory factor analytic and multiple indicators multiple causes (MIMIC) methodologies. Using the Arabic version of the General Aptitude Test (GAT), invariance was tested at the configural, metric, and scalar levels. Results indicated that the hybrid model, which incorporates both the multi-group case and the MIMIC model, provides a viable alternative for measuring invariance when grouping variables interact. Metric and scalar invariance were supported for all of the GAT's subscales with the exception of Word Meaning, for which the lack of invariance was likely caused by model misspecification. Subtle effects were observed in favor of public school testees, but these effects did not reach statistical significance.
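The abstract names the configural-metric-scalar sequence without showing the comparison rule behind it. As a small illustration of that decision logic, the sketch below takes fit statistics for three nested invariance models, which here are invented placeholders rather than the GAT results, and applies the chi-square difference test together with the common ΔCFI < .01 heuristic.

```python
from scipy.stats import chi2

# Hypothetical fit statistics for three nested invariance models
# (placeholders, not the values reported for the GAT).
models = {
    "configural": {"chisq": 412.3, "df": 180, "cfi": 0.962},
    "metric":     {"chisq": 431.8, "df": 196, "cfi": 0.960},
    "scalar":     {"chisq": 470.1, "df": 212, "cfi": 0.954},
}

# Each level is compared against the preceding, less constrained model.
for base, nested in [("configural", "metric"), ("metric", "scalar")]:
    d_chisq = models[nested]["chisq"] - models[base]["chisq"]
    d_df = models[nested]["df"] - models[base]["df"]
    p = chi2.sf(d_chisq, d_df)              # chi-square difference test
    d_cfi = models[base]["cfi"] - models[nested]["cfi"]
    # With large samples the difference test is oversensitive, so one common
    # rule retains invariance when either criterion is satisfied.
    verdict = "supported" if (p > .05 or d_cfi < .01) else "rejected"
    print(f"{nested:>10} vs {base}: d_chisq({d_df}) = {d_chisq:.1f}, "
          f"p = {p:.3f}, dCFI = {d_cfi:.3f} -> {verdict}")
```

The same stepwise logic extends to the hybrid multi-group/MIMIC setup the study proposes, with the grouping covariates entering as predictors of the latent factors and indicators.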