Collinearity refers to the non‐independence of predictor variables, usually in a regression‐type analysis. It is a common feature of any descriptive ecological data set and can be a problem for parameter estimation because it inflates the variance of regression parameters and hence can lead to the misidentification of relevant predictors in a statistical model. Collinearity is a severe problem when a model is trained on data from one region or time and then used to predict to another with a different or unknown structure of collinearity. To demonstrate the reach of the problem of collinearity in ecology, we show how relationships among predictors differ between biomes and change over spatial scales and through time. Across disciplines, different approaches to addressing collinearity problems have been developed, ranging from clustering of predictors and threshold‐based pre‐selection, through latent variable methods, to shrinkage and regularisation. Using simulated data with five predictor‐response relationships of increasing complexity and eight levels of collinearity, we compared ways to address collinearity with standard multiple regression and machine‐learning approaches. We assessed the performance of each approach by testing its impact on prediction to new data. In the extreme, we tested whether the methods were able to identify the true underlying relationship in a training dataset with strong collinearity by evaluating their performance on a test dataset without any collinearity. We found that methods specifically designed for collinearity, such as latent variable methods and tree‐based models, did not outperform the traditional GLM and threshold‐based pre‐selection. Our results highlight the value of GLM in combination with penalised methods (particularly ridge) and threshold‐based pre‐selection when omitted variables are considered in the final interpretation. However, all approaches tested yielded degraded predictions under change in collinearity structure, and the ‘folk lore’ threshold of |r| > 0.7 for correlation coefficients between predictor variables was an appropriate indicator of when collinearity begins to severely distort model estimation and subsequent prediction. The use of ecological understanding of the system in pre‐analysis variable selection and the choice of the least sensitive statistical approaches reduce the problems of collinearity, but cannot ultimately solve them.
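As a rough illustration of the |r| > 0.7 rule of thumb and of variance inflation, the following sketch flags strongly correlated predictor pairs and computes variance inflation factors with NumPy. It is not the authors' simulation design: the predictor names, the sample size, the target correlation of about -0.9 and the helper functions `flag_collinear_pairs` and `variance_inflation_factors` are all assumptions made for this example.

```python
import numpy as np


def flag_collinear_pairs(X, names, threshold=0.7):
    """Return (name_i, name_j, r) for predictor pairs with |r| above the threshold."""
    corr = np.corrcoef(X, rowvar=False)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) > threshold:
                flagged.append((names[i], names[j], float(corr[i, j])))
    return flagged


def variance_inflation_factors(X):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing column j on the other columns."""
    n, p = X.shape
    vifs = []
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        design = np.column_stack([np.ones(n), others])  # intercept + remaining predictors
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        r2 = 1.0 - resid.var() / y.var()
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)


# Hypothetical predictors: elevation is built to correlate at about -0.9 with temperature.
rng = np.random.default_rng(0)
n = 500
temperature = rng.normal(size=n)
elevation = -0.9 * temperature + np.sqrt(1 - 0.9**2) * rng.normal(size=n)
rainfall = rng.normal(size=n)  # generated independently of the other two

names = ["temperature", "elevation", "rainfall"]
X = np.column_stack([temperature, elevation, rainfall])

flagged = flag_collinear_pairs(X, names)
vifs = variance_inflation_factors(X)
print(flagged)
print(dict(zip(names, vifs.round(2))))
```

Under these assumptions only the temperature/elevation pair should be flagged, and those two predictors should show clearly inflated variance while rainfall stays near a factor of 1. In threshold‐based pre‐selection one member of a flagged pair would be dropped before fitting; as the abstract notes, the omitted variable still has to be considered when interpreting the final model.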
Predictive models are central to many scientific disciplines and vital for informing management in a rapidly changing world. However, limited understanding of the accuracy and precision of models transferred to novel conditions (their 'transferability') undermines confidence in their predictions. Here, 50 experts identified priority knowledge gaps which, if filled, will most improve model transfers. These are summarized into six technical and six fundamental challenges, which underlie the combined need to intensify research on the determinants of ecological predictability, including species traits and data quality, and develop best practices for transferring models. Of high importance is the identification of a widely applicable set of transferability metrics, with appropriate tools to quantify the sources and impacts of prediction uncertainty under novel conditions.
Predicting the Unknown
Predictions facilitate the formulation of quantitative, testable hypotheses that can be refined and validated empirically [1]. Predictive models have thus become ubiquitous in numerous scientific disciplines, including ecology [2], where they provide means for mapping species distributions, explaining population trends, or quantifying the risks of biological invasions and disease outbreaks (e.g., [3,4]). The practical value of predictive models in supporting policy and decision making has therefore grown rapidly (Box 1) [5]. With that has come an increasing desire to predict (see Glossary) the state of ecological features (e.g., species, habitats) and our likely impacts upon them [5], prompting a shift from explanatory models to anticipatory predictions [2]. However, in many situations, severe data deficiencies preclude the development of specific models, and the collection of new data can be prohibitively costly or simply impossible [6]. It is in this context that interest in transferable models (i.e., those that can be legitimately projected beyond the spatial and temporal bounds of their underlying data [7]) has grown.
Transferred models must balance the tradeoff between estimation and prediction bias and variance (homogenization versus nontransferability, sensu [8]). Ultimately, models that can
Highlights
Models transferred to novel conditions could provide predictions in data-poor scenarios, contributing to more informed management decisions.
The determinants of ecological predictability are, however, still insufficiently understood.
Predictions from transferred ecological models are affected by species' traits, sampling biases, biotic interactions, nonstationarity, and the degree of environmental dissimilarity between reference and target systems.
We synthesize six technical and six fundamental challenges that, if resolved, will catalyze practical and conceptual advances in model transfers.
We propose that the most immediate obstacle to improving understanding lies in the absence of a widely applicable set of metrics for assessing transferability, and that encouraging the development of models grounded in well-established mech...
Species distribution models (SDMs) constitute the most common class of models across ecology, evolution and conservation. The advent of ready‐to‐use software packages and increasing availability of digital geoinformation have considerably assisted the application of SDMs in the past decade, greatly enabling their broader use for informing conservation and management, and for quantifying impacts from global change. However, models must be fit for purpose, with all important aspects of their development and applications properly considered. Despite the widespread use of SDMs, standardisation and documentation of modelling protocols remain limited, which makes it hard to assess whether development steps are appropriate for end use. To address these issues, we propose a standard protocol for reporting SDMs, with an emphasis on describing how a study's objective is achieved through a series of modelling decisions. We call this the ODMAP (Overview, Data, Model, Assessment and Prediction) protocol, as its components reflect the main steps involved in building SDMs and other empirically‐based biodiversity models. The ODMAP protocol serves two main purposes. First, it provides a checklist for authors, detailing key steps for model building and analyses, and thus represents a quick guide and generic workflow for modern SDMs. Second, it introduces a structured format for documenting and communicating the models, ensuring transparency and reproducibility, and facilitating peer review and expert evaluation of model quality, as well as meta‐analyses. We detail all elements of ODMAP, and explain how it can be used for different model objectives and applications, and how it complements efforts to store associated metadata and define modelling standards. We illustrate its utility by revisiting nine previously published case studies, and provide an interactive web‐based application to facilitate its use. We plan to advance ODMAP by encouraging its further refinement and adoption by the scientific community.
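The checklist idea can be sketched in a few lines of Python. Only the five top‐level section names come from the abstract; the `missing_sections` helper and the report entries are hypothetical and do not reproduce the actual ODMAP template or its web application.

```python
# The five top-level ODMAP components named in the abstract.
ODMAP_SECTIONS = ("Overview", "Data", "Model", "Assessment", "Prediction")


def missing_sections(report):
    """Return the ODMAP sections that are absent or left empty in a report dict."""
    return [s for s in ODMAP_SECTIONS if not str(report.get(s, "")).strip()]


# Hypothetical, partially completed report for illustration only.
report = {
    "Overview": "Objective: map the current distribution of a focal species.",
    "Data": "Presence/absence records and gridded climate predictors.",
    "Model": "",
}

print(missing_sections(report))  # ['Model', 'Assessment', 'Prediction']
```

A structure like this only checks that each section has been filled in; judging whether the documented modelling decisions suit the study objective remains a task for authors and reviewers.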