Summary
The method of factor analysis is widely used as an exploratory tool to reduce the dimensionality of multivariate data. The fact that the standard model is strictly applicable only when the manifest variables are scaled is a serious limitation in social science where the variables are often categorical. In this paper we aim to provide a theoretical framework within which methods for the factor analysis of categorical data can be devised and compared. Discussion is restricted to the case of ordered categories where the latent variables are continuous. It is argued that the choice of model should be made from a restricted set which includes two existing models as special cases. A new method is proposed together with a simple approximate technique of fitting for the one‐factor model. The paper concludes with an evaluation of existing methods and makes some suggestions about the direction which future research should take.
When a model is fitted to data in a 2^p contingency table many cells are likely to have very small expected frequencies. This sparseness invalidates the usual approximation to the distribution of the chi-squared or log-likelihood tests of goodness of fit. We present a solution to this problem by proposing a test based on a comparison of the observed and expected frequencies of the second-order margins of the table. A chi-squared approximation to the sampling distribution is provided using asymptotic moments. This can be straightforwardly calculated from the expected cell frequencies. The new test is applied to several previously published examples relating to the fitting of latent variable models, but its application is quite general.
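The comparison of second-order margins can be illustrated with a minimal sketch. The abstract does not give the exact form of the statistic, so the Pearson-type sum used here is an assumption, as are the function names `second_order_margins` and `margin_statistic` and the toy frequencies. The moment-based chi-squared approximation to the sampling distribution is not implemented.

```python
from itertools import combinations, product


def second_order_margins(cell_freqs, p):
    """Collapse a 2^p table onto its second-order margins.

    cell_freqs maps each response pattern (a tuple of p zeros and ones)
    to a frequency. The result maps each item pair (i, j) to the total
    frequency of patterns in which items i and j are both 1.
    """
    margins = {}
    for i, j in combinations(range(p), 2):
        margins[(i, j)] = sum(
            freq
            for pattern, freq in cell_freqs.items()
            if pattern[i] == 1 and pattern[j] == 1
        )
    return margins


def margin_statistic(observed, expected, p):
    """Pearson-type discrepancy over the second-order margins.

    Assumption: sum of (O - E)^2 / E across item pairs. The statistic
    in the paper may be defined differently.
    """
    obs = second_order_margins(observed, p)
    exp = second_order_margins(expected, p)
    return sum((obs[pair] - exp[pair]) ** 2 / exp[pair] for pair in obs)


# Toy data (hypothetical): p = 3 binary items, 80 respondents.
# Expected frequencies come from an independence model with P(1) = 0.5.
p = 3
expected = {pattern: 10.0 for pattern in product([0, 1], repeat=p)}
observed = {pattern: 10 for pattern in product([0, 1], repeat=p)}
observed[(1, 1, 1)] = 14
observed[(0, 0, 0)] = 6

statistic = margin_statistic(observed, expected, p)
```

Each margin pools several cells of the full table, so its expected frequency stays reasonably large even when individual cells are sparse. That pooling is the motivation the abstract gives for working with margins rather than cells.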
Modern factor analysis is the outgrowth of Spearman's original "2-factor" model of intelligence, according to which a mental test score is regarded as the sum of a general factor and a specific factor. As early as 1914, Godfrey Thomson realized that the data did not require this interpretation and he demonstrated this by proposing what became known as his "bonds" model of intelligence. Van der Maas et al. (2006) have recently drawn attention to what they perceive as difficulties with both models and have proposed a 3rd model. Neither alternative requires the general factor that was at the core of Spearman's idea. Although Thomson's model has been largely forgotten, the authors show that it merits further consideration because it can compete, statistically and biologically, on equal terms with Spearman's model. In particular, they show that it is impossible to distinguish statistically between the 2 models. There are also lessons to be learnt from the way in which Thomson arrived at his model and from the subsequent debate between Spearman and Thomson. The extent to which the recent proposal by van der Maas et al. may offer any advantage over Spearman's and Thomson's models is unclear and requires further investigation.