Summary 1. The use of both linear and generalized linear mixed-effects models (LMMs and GLMMs) has become popular not only in social and medical sciences, but also in biological sciences, especially in the field of ecology and evolution. Information criteria, such as Akaike Information Criterion (AIC), are usually presented as model comparison tools for mixed-effects models. 2. The presentation of 'variance explained' (R²) as a relevant summarizing statistic of mixed-effects models, however, is rare, even though R² is routinely reported for linear models (LMs) and also generalized linear models (GLMs). R² has the extremely useful property of providing an absolute value for the goodness-of-fit of a model, which cannot be given by the information criteria. As a summary statistic that describes the amount of variance explained, R² can also be a quantity of biological interest. 3. One reason for the under-appreciation of R² for mixed-effects models lies in the fact that R² can be defined in a number of ways. Furthermore, most definitions of R² for mixed-effects models have theoretical problems (e.g. decreased or negative R² values in larger models) and/or their use is hindered by practical difficulties (e.g. implementation). 4. Here, we make a case for the importance of reporting R² for mixed-effects models. We first provide the common definitions of R² for LMs and GLMs and discuss the key problems associated with calculating R² for mixed-effects models. We then recommend a general and simple method for calculating two types of R² (marginal and conditional R²) for both LMMs and GLMMs, which are less susceptible to common problems. 5. This method is illustrated by examples and can be widely employed by researchers in any field of research, regardless of software packages used for fitting mixed-effects models. The proposed method has the potential to facilitate the presentation of R² for a wide range of circumstances.
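The abstract above names marginal and conditional R² without giving formulas. As a minimal sketch, assuming the usual variance-component definitions (marginal R² is the share of total variance attributed to fixed effects; conditional R² is the share attributed to fixed plus random effects), the two quantities could be computed as follows. The function name `r2_mixed` and its arguments are hypothetical, and the variance components are assumed to have been estimated already by whatever software fitted the model.

```python
def r2_mixed(var_fixed, var_random, var_residual):
    """Return (marginal R2, conditional R2) from variance components.

    var_fixed    -- variance of the fixed-effect predictions
    var_random   -- iterable of random-effect variances (one per random term)
    var_residual -- residual variance (for GLMMs this would be the
                    distribution-specific variance on the link scale)
    All inputs are assumed to come from an already fitted model.
    """
    random_total = sum(var_random)
    total = var_fixed + random_total + var_residual
    if total <= 0:
        raise ValueError("total variance must be positive")
    marginal = var_fixed / total
    conditional = (var_fixed + random_total) / total
    return marginal, conditional


# Illustrative numbers only, not taken from any data set.
marginal, conditional = r2_mixed(2.0, [1.0, 1.0], 4.0)
print(marginal, conditional)
```

Because the random-effect variances are non-negative, conditional R² computed this way is never smaller than marginal R².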
Repeatability (more precisely the common measure of repeatability, the intra-class correlation coefficient, ICC) is an important index for quantifying the accuracy of measurements and the constancy of phenotypes. It is the proportion of phenotypic variation that can be attributed to between-subject (or between-group) variation. As a consequence, the non-repeatable fraction of phenotypic variation is the sum of measurement error and phenotypic flexibility. There are several ways to estimate repeatability for Gaussian data, but there are no formal agreements on how repeatability should be calculated for non-Gaussian data (e.g. binary, proportion and count data). In addition to point estimates, appropriate uncertainty estimates (standard errors and confidence intervals) and statistical significance for repeatability estimates are required regardless of the types of data. We review the methods for calculating repeatability and the associated statistics for Gaussian and non-Gaussian data. For Gaussian data, we present three common approaches for estimating repeatability: correlation-based, analysis of variance (ANOVA)-based and linear mixed-effects model (LMM)-based methods, while for non-Gaussian data, we focus on generalised linear mixed-effects models (GLMM) that allow the estimation of repeatability on the original and on the underlying latent scale. We also address a number of methods for calculating standard errors, confidence intervals and statistical significance; the most accurate and recommended methods are parametric bootstrapping, randomisation tests and Bayesian approaches. We advocate the use of LMM- and GLMM-based approaches mainly because of the ease with which confounding variables can be controlled for. Furthermore, we compare two types of repeatability (ordinary repeatability and extrapolated repeatability) in relation to narrow-sense heritability. 
This review serves as a collection of guidelines and recommendations for biologists to calculate repeatability and heritability from both Gaussian and non-Gaussian data.
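To make the ANOVA-based approach mentioned above concrete, here is a minimal sketch for Gaussian data in a balanced design (the same number of measurements per subject). It assumes the standard one-way ANOVA estimator, in which the among-subject variance is (MS_among - MS_within) / k and repeatability is that variance divided by its sum with MS_within. The function name `anova_repeatability` is hypothetical, and unbalanced designs, uncertainty estimates and significance tests are not covered.

```python
from statistics import mean


def anova_repeatability(groups):
    """ANOVA-based repeatability (ICC) for a balanced one-way design.

    groups -- list of lists, one inner list of measurements per subject,
              all of the same length k >= 2.
    """
    k = len(groups[0])
    if k < 2 or any(len(g) != k for g in groups):
        raise ValueError("this sketch requires a balanced design with k >= 2")
    n_groups = len(groups)
    if n_groups < 2:
        raise ValueError("at least two subjects are needed")

    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    # Mean squares among and within subjects.
    ms_among = k * sum((m - grand_mean) ** 2 for m in group_means) / (n_groups - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (n_groups * (k - 1))

    # Among-subject variance component, then the proportion of total variance.
    var_among = (ms_among - ms_within) / k
    return var_among / (var_among + ms_within)


# Toy data: three subjects measured twice each.
print(anova_repeatability([[1, 2], [3, 4], [5, 6]]))
```

Note that this estimator can return negative values when MS_within exceeds MS_among, which is one of the practical differences from mixed-model estimates that constrain variances to be non-negative.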
Summary 1. Linear regression models are an important statistical tool in evolutionary and ecological studies. Unfortunately, these models often yield some uninterpretable estimates and hypothesis tests, especially when models contain interactions or polynomial terms. Furthermore, the standard errors for treatment groups, although often of interest for including in a publication, are not directly available in a standard linear model. 2. Centring and standardization of input variables are simple means to improve the interpretability of regression coefficients. Further, refitting the model with a slightly modified model structure allows extracting the appropriate standard errors for treatment groups directly from the model. 3. Centring will make main effects biologically interpretable even when involved in interactions and thus avoids the potential misinterpretation of main effects. This also applies to the estimation of linear effects in the presence of polynomials. Categorical input variables can also be centred and this sometimes assists interpretation. 4. Standardization (z-transformation) of input variables results in the estimation of standardized slopes or standardized partial regression coefficients. Standardized slopes are comparable in magnitude within models as well as between studies. They have some advantages over partial correlation coefficients and are often the more interesting standardized effect size. 5. The thoughtful removal of intercepts or main effects allows extracting treatment means or treatment slopes and their appropriate standard errors directly from a linear model. This provides a simple alternative to the more complicated calculation of standard errors from contrasts and main effects. 6. The simple methods presented here put the focus on parameter estimation (point estimates as well as confidence intervals) rather than on significance thresholds. They allow fitting complex, but meaningful models that can be concisely presented and interpreted. The presented methods can also be applied to generalised linear models (GLM) and linear mixed models.
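The effect of centring and standardization on regression coefficients can be seen in a small sketch with a single predictor. The helper names (`centre`, `standardize`, `ols`) and the toy data are hypothetical and only illustrate the idea: centring the predictor moves the intercept to the mean response, and z-transforming it rescales the slope to units of one standard deviation of the predictor.

```python
from statistics import mean, stdev


def centre(xs):
    """Subtract the mean so the predictor has mean zero."""
    m = mean(xs)
    return [x - m for x in xs]


def standardize(xs):
    """z-transform: centre, then divide by the sample standard deviation."""
    m, s = mean(xs), stdev(xs)
    return [(x - m) / s for x in xs]


def ols(xs, ys):
    """Least-squares intercept and slope for a single predictor."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return my - slope * mx, slope


# Toy data following y = 1 + 2x exactly.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

print(ols(x, y))               # intercept is the prediction at x = 0
print(ols(centre(x), y))       # intercept is now the mean of y
print(ols(standardize(x), y))  # slope is the change in y per 1 SD of x
```

With interactions or polynomial terms the same transformations apply to each input variable before the product or power terms are formed.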
The coefficient of determination R² quantifies the proportion of variance explained by a statistical model and is an important summary statistic of biological interest. However, estimating R² for generalized linear mixed models (GLMMs) remains challenging. We have previously introduced a version of R² that we called R²_GLMM for Poisson and binomial GLMMs, but not for other distributional families. Similarly, we earlier discussed how to estimate intra-class correlation coefficients (ICCs) using Poisson and binomial GLMMs. In this paper, we generalize our methods to all other non-Gaussian distributions, in particular to negative binomial and gamma distributions that are commonly used for modelling biological data. While expanding our approach, we highlight two useful concepts for biologists, Jensen's inequality and the delta method, both of which help us in understanding the properties of GLMMs. Jensen's inequality has important implications for biologically meaningful interpretation of GLMMs, whereas the delta method allows a general derivation of variance associated with non-Gaussian distributions. We also discuss some special considerations for binomial GLMMs with binary or proportion data. We illustrate the implementation of our extension by worked examples from the field of ecology and evolution in the R environment. However, our method can be used across disciplines and regardless of statistical environments.
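The two concepts highlighted above can be illustrated with a short sketch for a log link. It assumes two textbook results rather than anything specific to the paper's worked examples: for a normally distributed linear predictor with mean mu and variance sigma2, the mean on the original scale is exp(mu + sigma2 / 2), which exceeds exp(mu) (Jensen's inequality); and the first-order delta method approximates the variance of ln(X) by Var(X) / E[X]^2. The function names are hypothetical.

```python
import math


def mean_on_original_scale(mu, sigma2):
    """E[exp(eta)] for eta ~ Normal(mu, sigma2), i.e. the log-normal mean."""
    return math.exp(mu + sigma2 / 2.0)


def delta_var_log(mean_x, var_x):
    """First-order delta-method approximation of Var[ln(X)].

    Uses Var[f(X)] ~ f'(E[X])**2 * Var(X) with f = ln, so f'(m) = 1 / m.
    """
    if mean_x <= 0:
        raise ValueError("mean must be positive for a log transformation")
    return var_x / mean_x ** 2


# Jensen's inequality: back-transforming the latent mean underestimates
# the mean on the original scale whenever sigma2 > 0.
print(math.exp(0.0), mean_on_original_scale(0.0, 1.0))

# Delta method for a Poisson variable with mean 4 (variance equals mean):
# the approximate variance on the log scale is 1 / 4.
print(delta_var_log(4.0, 4.0))
```

The approximation in `delta_var_log` is only first order, so it becomes less accurate when the mean is small relative to the variance.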