may also lead to dependence between species (phylogenetic structure) or populations of species (genetic structure), as those with more recent divergence will tend to be more similar than those which diverged longer ago (Harvey and Pagel 1991). While such underlying structures in the data are not fundamentally problematic for statistical analyses, they tend to create two undesirable outcomes. First, model error, as well as neglected processes and variables connected to these structures, often leads to dependence structures in the model residuals, which violates the critical assumption of independence made by many models and methods (Legendre and Fortin 1989, Miller et al. 2007). Second, because predictor variables are often correlated with underlying dependence structures (e.g. climate with space), models may use predictors to overfit the residual dependence structure and thereby remove it, partially or completely.
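One common way to check model residuals for the kind of dependence structure described above is Moran's I. The following is a minimal sketch, not the cited authors' procedure: the function names `morans_i` and `transect_weights`, the binary neighbour matrix, and the toy residual vectors are all assumptions made for illustration.

```python
import numpy as np

def morans_i(values, weights):
    """Moran's I for `values` under a spatial weight matrix `weights` (zero diagonal)."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    z = values - values.mean()
    # Cross-products of deviations, counted only for neighbouring pairs.
    numerator = (weights * np.outer(z, z)).sum()
    return (len(values) / weights.sum()) * numerator / (z ** 2).sum()

def transect_weights(n):
    """Hypothetical neighbourhood: n sites on a line, adjacent sites are neighbours."""
    idx = np.arange(n)
    return (np.abs(idx[:, None] - idx[None, :]) == 1).astype(float)

# Toy residuals (assumed, not real data): a smooth trend and an alternating pattern.
smooth_residuals = [1.0, 2.0, 3.0, 4.0, 5.0]
alternating_residuals = [1.0, -1.0, 1.0, -1.0]

i_smooth = morans_i(smooth_residuals, transect_weights(5))
i_alternating = morans_i(alternating_residuals, transect_weights(4))
print(i_smooth, i_alternating)
```

In this toy setting the smooth residuals give a positive value (neighbours resemble each other) and the alternating residuals give a negative one. Real applications would need a weight matrix built from actual site coordinates and a significance test, neither of which is shown here.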
Aim Distribution modelling relates sparse data on species occurrence or abundance to environmental information to predict the population of a species at any point in space. Recently, the importance of spatial autocorrelation in distributions has been recognized. Spatial autocorrelation can be categorized as exogenous (stemming from autocorrelation in the underlying variables) or endogenous (stemming from activities of the organism itself, such as dispersal). Typically, one asks whether spatial models explain additional variability (endogenous) in comparison to a fully specified habitat model. We turned this question around and asked: can habitat models explain additional variation when spatial structure is accounted for in a fully specified spatially explicit model? The aim was to find out to what degree habitat models may be inadvertently capturing spatial structure rather than true explanatory mechanisms.
Location We used data from 190 species of the North American Breeding Bird Survey covering the conterminous United States and southern Canada.
Methods We built 13 different models on 190 bird species using regression trees. Our habitat-based models used climate and landcover variables as independent variables. We also used random variables and simulated ranges to validate our results. The two spatially explicit models included only geographical coordinates or a contagion term as independent variables. As another angle on the question of mechanism vs. spatial structure we pitted a model using related bird species as predictors against a model using randomly selected bird species.
Results The spatially explicit models outperformed the traditional habitat models and the random predictor species outperformed the related predictor species. In addition, environmental variables produced a substantial R² in predicting artificial ranges.
Main conclusions We conclude that many explanatory variables with suitable spatial structure can work well in species distribution models.
The predictive power of environmental variables is not necessarily mechanistic, and spatial interpolation can outperform environmental explanatory variables.
Distribution models are used to predict the likelihood of occurrence or abundance of a species at locations where census data are not available. An integral part of modelling is the testing of model performance. We compared different schemes and measures for testing model performance using 79 species from the North American Breeding Bird Survey. The four testing schemes we compared featured increasing independence between test and training data: resubstitution, random data hold‐out and two spatially segregated data hold‐out designs. The different testing measures also addressed different levels of information content in the dependent variable: regression R² for absolute abundance, squared correlation coefficient r² for relative abundance and AUC/Somers' D for presence/absence. We found that higher levels of independence between test and training data lead to lower assessments of prediction accuracy. Even for data collected independently, spatial autocorrelation leads to dependence between random hold‐out test data and training data, and thus to inflated measures of model performance. While there is a general awareness of the importance of autocorrelation to model building and hypothesis testing, its consequences via violation of independence between training and testing data have not been addressed systematically and comprehensively before. Furthermore, increasing information content (from correctly classifying presence/absence, to predicting relative abundance, to predicting absolute abundance) leads to decreasing predictive performance. The current tests for presence/absence distribution models are typically overly optimistic because a) the test and training data are not independent and b) the correct classification of presence/absence has a relatively low information content and thus capability to address ecological and conservation questions compared to a prediction of abundance.
Meaningful evaluation of model performance requires testing on spatially independent data, if the intended application of the model is to predict into new geographic or climatic space, which arguably is the case for most applications of distribution models.
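The contrast between random and spatially segregated hold-out can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the study's method: the data are simulated rather than Breeding Bird Survey counts, a nearest-neighbour interpolator stands in for the regression trees, and the function names, the smooth response surface, and the 0.75 cut-off for the held-out strip are all hypothetical.

```python
import numpy as np

def r_squared(y_true, y_pred):
    """Regression R^2: 1 - SSE/SST, with SST taken about the test-set mean."""
    sse = np.sum((y_true - y_pred) ** 2)
    sst = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - sse / sst

def nearest_neighbour_predict(train_xy, train_y, test_xy):
    """Predict each test site with the value observed at the closest training site."""
    # Pairwise squared distances, shape (n_test, n_train).
    d2 = ((test_xy[:, None, :] - train_xy[None, :, :]) ** 2).sum(axis=2)
    return train_y[d2.argmin(axis=1)]

def evaluate(xy, y, test_mask):
    """Fit on sites outside `test_mask`, score R^2 on sites inside it."""
    pred = nearest_neighbour_predict(xy[~test_mask], y[~test_mask], xy[test_mask])
    return r_squared(y[test_mask], pred)

rng = np.random.default_rng(0)
n = 400
xy = rng.uniform(0.0, 1.0, size=(n, 2))  # simulated site coordinates
# Assumed spatially smooth "abundance" surface plus a little noise.
y = np.sin(4 * xy[:, 0]) + np.cos(4 * xy[:, 1]) + rng.normal(0.0, 0.1, size=n)

# Random hold-out: test sites are scattered among the training sites.
random_mask = np.zeros(n, dtype=bool)
random_mask[rng.choice(n, size=n // 4, replace=False)] = True

# Spatially segregated hold-out: the easternmost strip is withheld as a block.
spatial_mask = xy[:, 0] > 0.75

r2_random = evaluate(xy, y, random_mask)
r2_spatial = evaluate(xy, y, spatial_mask)
print(f"random hold-out R^2:  {r2_random:.2f}")
print(f"spatial hold-out R^2: {r2_spatial:.2f}")
```

Because every randomly held-out site has close training neighbours, the random scheme rewards pure interpolation, whereas the withheld strip forces the predictor to extrapolate. The gap between the two scores in this toy case mirrors the inflation the abstract describes, though its size depends entirely on the assumed surface and split.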
In ecology, the true causal structure for a given problem is often not known, and several plausible models and thus model predictions exist. It has been claimed that using weighted averages of these models can reduce prediction error, as well as better reflect model selection uncertainty. These claims, however, are often demonstrated by isolated examples. Analysts must better understand under which conditions model averaging can improve predictions and their uncertainty estimates. Moreover, a large range of different model averaging methods exists, raising the question of how they differ in their behaviour and performance. Here, we review the mathematical foundations of model averaging along with the diversity of approaches available. We explain that the error in model‐averaged predictions depends on each model's predictive bias and variance, as well as the covariance in predictions between models, and uncertainty about model weights. We show that model averaging is particularly useful if the predictive error of contributing model predictions is dominated by variance, and if the covariance between models is low. For noisy data, which predominate in ecology, these conditions will often be met. Many different methods to derive averaging weights exist, from Bayesian over information‐theoretical to cross‐validation optimized and resampling approaches. A general recommendation is difficult, because the performance of methods is often context dependent. Importantly, estimating weights creates some additional uncertainty. As a result, estimated model weights may not always outperform arbitrary fixed weights, such as equal weights for all models. When averaging a set of models with many inadequate models, however, estimating model weights will typically be superior to equal weights. 
We also investigate the quality of the confidence intervals calculated for model‐averaged predictions, showing that they differ greatly in behaviour and seldom manage to achieve nominal coverage. Our overall recommendations stress the importance of non‐parametric methods such as cross‐validation for a reliable uncertainty quantification of model‐averaged predictions.
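The dependence of averaged prediction error on bias, variance and between-model covariance can be made concrete for the simplest case of fixed weights. For weights w that sum to one, the averaged prediction has bias equal to the weighted sum of the model biases and variance equal to wᵀΣw, where Σ is the covariance matrix of the model predictions, so its mean squared error is the squared bias plus that variance. The sketch below evaluates this for two hypothetical models; the bias values, unit variances, and correlation settings are illustrative assumptions, and the extra uncertainty from estimating weights is deliberately ignored.

```python
import numpy as np

def averaged_mse(weights, biases, covariance):
    """MSE of a fixed-weight model average: (w . b)^2 + w^T Sigma w."""
    weights = np.asarray(weights, dtype=float)
    bias = weights @ np.asarray(biases, dtype=float)
    variance = weights @ np.asarray(covariance, dtype=float) @ weights
    return bias ** 2 + variance

def two_model_covariance(rho):
    """Covariance of two predictions with unit variance and correlation rho."""
    return np.array([[1.0, rho], [rho, 1.0]])

equal_weights = [0.5, 0.5]
biases = [0.2, -0.2]  # hypothetical biases of opposite sign

# Using the first model alone (all weight on model 1).
single_model_mse = averaged_mse([1.0, 0.0], biases, two_model_covariance(0.0))
# Equal-weight average when prediction errors are uncorrelated.
mse_uncorrelated = averaged_mse(equal_weights, biases, two_model_covariance(0.0))
# Equal-weight average when prediction errors are perfectly correlated.
mse_correlated = averaged_mse(equal_weights, biases, two_model_covariance(1.0))
print(single_model_mse, mse_uncorrelated, mse_correlated)
```

With uncorrelated predictions the average roughly halves the error of a single model here, while perfectly correlated predictions gain almost nothing from averaging beyond the cancelled bias, consistent with the abstract's point that averaging helps most when variance dominates and covariance is low.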
Species-specific climate responses within ecological communities may disrupt the synchrony of co-evolved mutualisms that are based on the shared timing of seasonal events, such as seed dispersal by ants (myrmecochory). The spring phenology of plants and ants coincides with marked changes in temperature, light and moisture. We investigate how these environmental drivers influence both seed release by early and late spring woodland herb species, and initiation of spring foraging by seed-dispersing ants. We pair experimental herbaceous transplants with artificial ant bait stations across north- and south-facing slopes at two contrasting geographic locations. This use of space enables robust identification of plant fruiting and ant foraging cues, and the use of transplants permits us to assess plasticity in plant phenology. We find that warming temperatures act as the primary phenological cue for plant fruiting and ant foraging. Moreover, the plasticity in plant response across locations, despite transplants being from the same source, suggests a high degree of portability in the seed-dispersing mutualism. However, we also find evidence for potential climate-driven facilitative failure that may lead to phenological asynchrony. Specifically, at the location where the early flowering species (Hepatica nobilis) is decreasing in abundance and distribution, we find far fewer seed-dispersing ants foraging during its fruit set than during that of the later flowering Hexastylis arifolia. Notably, the key seed disperser, Aphaenogaster rudis, fails to emerge during early fruit set at this location. At the second location, A. picea forages equally during early and late seed release. These results indicate that climate-driven changes might shift species-specific interactions in a plant-ant mutualism resulting in winners and losers within the myrmecochorous plant guild.