This article challenges Fixed Effects (FE) modeling as the ‘default’ for time-series-cross-sectional and panel data. Understanding different within and between effects is crucial when choosing modeling strategies. The downside of Random Effects (RE) modeling—correlated lower-level covariates and higher-level residuals—is omitted-variable bias, solvable with Mundlak's (1978a) formulation. Consequently, RE can provide everything that FE promises and more, as confirmed by Monte-Carlo simulations, which additionally show problems with Plümper and Troeger's FE Vector Decomposition method when data are unbalanced. As well as incorporating time-invariant variables, RE models are readily extendable, with random coefficients, cross-level interactions and complex variance functions. We argue not simply for technical solutions to endogeneity, but for the substantive importance of context/heterogeneity, modeled using RE. The implications extend beyond political science to all multilevel datasets. However, omitted variables could still bias estimated higher-level variable effects; as with any model, care is required in interpretation.
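As a minimal sketch of the idea behind Mundlak's formulation: each lower-level covariate is split into its group mean (the "between" part) and the deviation from that mean (the "within" part), and both are entered into the RE model so that within and between effects are estimated separately. The function name `within_between` and the toy data are illustrative assumptions, not taken from the article, and no model is fitted here.

```python
from collections import defaultdict

def within_between(groups, x):
    """Split covariate x into its group mean (between) and deviation from it (within)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for g, v in zip(groups, x):
        sums[g] += v
        counts[g] += 1
    # Group mean of x for each higher-level unit
    means = {g: sums[g] / counts[g] for g in sums}
    between = [means[g] for g in groups]
    within = [v - means[g] for g, v in zip(groups, x)]
    return between, within

# Hypothetical toy data: five observations nested in two units
groups = ["A", "A", "B", "B", "B"]
x = [1.0, 3.0, 4.0, 5.0, 9.0]
between, within = within_between(groups, x)
```

The two resulting columns would then replace the raw covariate in a random-intercept model. The within part is orthogonal to the group means by construction, which is why its coefficient is not biased by time-invariant unit-level omitted variables.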
This paper assesses the options available to researchers analysing multilevel (including longitudinal) data, with the aim of supporting good methodological decision-making. Given the confusion in the literature about the key properties of fixed and random effects (FE and RE) models, we present these models' capabilities and limitations. We also discuss the within-between RE model, sometimes misleadingly labelled a 'hybrid' model, showing that it is the most general of the three, with all the strengths of the other two. As such, and because it allows for important extensions, notably random slopes, we argue it should be used (as a starting point at least) in all multilevel analyses. We develop the argument through simulations, evaluating how these models cope with some likely mis-specifications. These simulations reveal that (1) failing to include random slopes can generate anticonservative standard errors, and (2) assuming random intercepts are Normally distributed, when they are not, introduces only modest biases. These results strengthen the case for the use of, and need for, these models.
Many ecological- and individual-level analyses of voting behaviour use multiple regressions with a considerable number of independent variables, but few discussions of their results pay any attention to the potential impact of inter-relationships among those independent variables: do they confound the regression parameters and hence their interpretation? Three empirical examples are deployed to address that question, with results that suggest considerable problems. Inter-relationships between variables, even if not approaching high collinearity, can have a substantial impact on regression model results and how they are interpreted in the light of prior expectations. Confounded relationships could be the norm, and interpretations open to doubt, unless considerable care is applied in the analyses. An extended principal components method for applying that care is introduced and exemplified.
A common application of multilevel models is to apportion the variance in the response according to the different levels of the data. Whereas partitioning variances is straightforward in models with a continuous response variable with a normal error distribution at each level, the extension of this partitioning to models with binary responses or to proportions or counts is less obvious. We describe methodology due to Goldstein and co-workers for apportioning variance that is attributable to higher levels in multilevel binomial logistic models. This partitioning they referred to as the variance partition coefficient. We consider extending the variance partition coefficient concept to data sets when the response is a proportion and where the binomial assumption may not be appropriate owing to overdispersion in the response variable. Using the literacy data from the 1991 Indian census we estimate simple and complex variance partition coefficients at multiple levels of geography in models with significant overdispersion and thereby establish the relative importance of different geographic levels that influence educational disparities in India. Copyright 2005 Royal Statistical Society.
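For orientation, the variance partition coefficient can be sketched as follows. For a continuous response it is the higher-level variance divided by the total variance. For a two-level binomial logistic model, one commonly used approach (the latent-variable formulation) treats the level-1 variance as that of the standard logistic distribution, π²/3. The function names below are illustrative assumptions, and the sketch does not cover the overdispersed-proportion extension that the abstract describes.

```python
import math

def vpc_continuous(sigma2_u, sigma2_e):
    """VPC for a Normal response: share of total variance at the higher level."""
    if sigma2_u < 0 or sigma2_e <= 0:
        raise ValueError("variances must be non-negative (level 1 strictly positive)")
    return sigma2_u / (sigma2_u + sigma2_e)

def vpc_logistic_latent(sigma2_u):
    """VPC for a two-level logistic model under the latent-variable formulation.

    The level-1 variance is fixed at pi^2 / 3, the variance of the
    standard logistic distribution.
    """
    if sigma2_u < 0:
        raise ValueError("variance must be non-negative")
    return sigma2_u / (sigma2_u + math.pi ** 2 / 3)

# Hypothetical higher-level variance on the logit scale
example_vpc = vpc_logistic_latent(1.0)
```

The latent-variable version is only one of several approximations; it is a constant for a given higher-level variance, whereas alternatives based on simulation or linearisation depend on the covariate values at which they are evaluated.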