Summary: Recent studies have demonstrated a need for increased rigour in building and evaluating ecological niche models (ENMs) based on presence‐only occurrence data. Two major goals are to balance goodness‐of‐fit with model complexity (e.g. by ‘tuning’ model settings) and to evaluate models with spatially independent data. These issues are especially critical for data sets suffering from sampling bias, and for studies that require transferring models across space or time (e.g. responses to climate change or spread of invasive species). Efficient implementation of procedures to accomplish these goals, however, requires automation. We developed ENMeval, an R package that: (i) creates data sets for k‐fold cross‐validation using one of several methods for partitioning occurrence data (including options for spatially independent partitions), (ii) builds a series of candidate models using Maxent with a variety of user‐defined settings and (iii) provides multiple evaluation metrics to aid in selecting optimal model settings. The six methods for partitioning data are n−1 jackknife, random k‐folds (= bins), user‐specified folds and three methods of masked geographically structured folds. ENMeval quantifies six evaluation metrics: the area under the curve of the receiver‐operating characteristic plot for test localities (AUC_TEST), the difference between training and testing AUC (AUC_DIFF), two different threshold‐based omission rates for test localities and the Akaike information criterion corrected for small sample sizes (AICc). We demonstrate ENMeval by tuning model settings for eight tree species of the genus Coccoloba in Puerto Rico based on AICc. Evaluation metrics varied substantially across model settings, and models selected with AICc differed from default ones. In summary, ENMeval facilitates the production of better ENMs and should promote future methodological research on many outstanding issues.
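To make the partitioning and evaluation ideas above concrete, the following is a minimal Python sketch, not ENMeval's R implementation. It shows a random k-fold assignment, one possible geographically structured partition, and a rank-based test AUC with the training-minus-testing AUC difference. The function names are hypothetical, and the rule of cutting at median latitude and then median longitude is an assumption made for illustration; the abstract does not specify how the geographic folds are drawn.

```python
import random
from statistics import median


def random_kfold(n_points, k, seed=0):
    """Assign each of n_points occurrence records to one of k random bins."""
    rng = random.Random(seed)
    folds = [i % k for i in range(n_points)]  # near-equal bin sizes
    rng.shuffle(folds)
    return folds


def block_partition(coords):
    """Assign (lon, lat) records to four geographic bins (0..3).

    Assumption for illustration: cut at the median latitude, then at the
    median longitude within each half, so bins hold near-equal counts.
    """
    if not coords:
        return []
    lat_cut = median(lat for _, lat in coords)
    south = [lon for lon, lat in coords if lat <= lat_cut]
    north = [lon for lon, lat in coords if lat > lat_cut]
    lon_cut_south = median(south)  # never empty: some record has lat <= median
    lon_cut_north = median(north) if north else 0.0
    folds = []
    for lon, lat in coords:
        if lat <= lat_cut:
            folds.append(0 if lon <= lon_cut_south else 1)
        else:
            folds.append(2 if lon <= lon_cut_north else 3)
    return folds


def auc(presence_scores, background_scores):
    """Rank-based AUC: chance a presence outscores a background point.

    Ties count as half a win.
    """
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))


def auc_diff(train_scores, test_scores, background_scores):
    """Training AUC minus testing AUC; larger values hint at overfitting."""
    return auc(train_scores, background_scores) - auc(test_scores, background_scores)
```

In a k-fold workflow, each bin would be withheld in turn as the test set while the model is fit on the remaining bins, and the metrics would then be averaged across bins.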
Quantitative evaluations to optimize complexity have become standard for avoiding overfitting of ecological niche models (ENMs) that estimate species' potential geographic distributions. ENMeval was the first R package to make such evaluations (often termed model tuning) widely accessible for the Maxent algorithm.
Models of species ecological niches and geographic distributions now represent a widely used tool in ecology, evolution, and biogeography. However, the very common situation of species with few available occurrence localities presents major challenges for such modeling techniques, in particular regarding model complexity and evaluation. Here, we summarize the state of the field regarding these issues and provide a worked example using the technique Maxent for a small mammal endemic to Madagascar (the nesomyine rodent Eliurus majori). Two relevant model‐selection approaches exist in the literature (information criteria, specifically AICc; and performance predicting withheld data, via a jackknife), but AICc is not strictly applicable to machine‐learning algorithms like Maxent. We compare models chosen under each selection approach with those corresponding to Maxent default settings, both with and without spatial filtering of occurrence records to reduce the effects of sampling bias. Both selection approaches chose simpler models than those made using default settings. Furthermore, the approaches converged on a similar answer when sampling bias was taken into account, but differed markedly with the unfiltered occurrence data. Specifically, for that dataset, the models selected by AICc had substantially fewer parameters than those identified by performance on withheld data. Based on our knowledge of the study species, models chosen under both AICc and withheld‐data‐selection showed higher ecological plausibility when combined with spatial filtering. The results for this species intimate that AICc may consistently select models with fewer parameters and be more robust to sampling bias. To test these hypotheses and reach general conclusions, comprehensive research should be undertaken with a wide variety of real and simulated species. 
Meanwhile, we recommend that researchers assess the critical yet underappreciated issue of model complexity both via information criteria and performance on withheld data, comparing the results between the two approaches and taking into account ecological plausibility.
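The comparison recommended above can be sketched in a few lines of Python. The AICc formula is the standard one, AICc = 2k − 2 ln L + 2k(k + 1)/(n − k − 1), where k is the number of parameters and n the number of occurrence localities. Everything else is an assumption for illustration: the function names, the dictionary keys, the candidate values (made up, not taken from the study) and the withheld-data rule of lowest test omission rate with ties broken by highest test AUC. The log-likelihood is assumed to be supplied by the modeling step.

```python
import math


def aicc(log_likelihood, n_params, n_occurrences):
    """AICc = 2k - 2 ln L + 2k(k + 1) / (n - k - 1)."""
    k, n = n_params, n_occurrences
    if n - k - 1 <= 0:
        # Correction term is undefined; this sketch ranks such models last.
        return math.inf
    return 2 * k - 2 * log_likelihood + (2 * k * (k + 1)) / (n - k - 1)


def select_by_aicc(candidates, n_occurrences):
    """Return the candidate with the lowest AICc."""
    return min(
        candidates,
        key=lambda m: aicc(m["log_lik"], m["n_params"], n_occurrences),
    )


def select_by_withheld(candidates):
    """Return the candidate that best predicts withheld data.

    Assumed rule: lowest test omission rate, then highest test AUC.
    """
    return min(candidates, key=lambda m: (m["test_omission"], -m["test_auc"]))


# Made-up candidate models, for illustration only.
candidates = [
    {"settings": "L, rm=1", "log_lik": -100.0, "n_params": 5,
     "test_omission": 0.10, "test_auc": 0.80},
    {"settings": "LQH, rm=1", "log_lik": -95.0, "n_params": 20,
     "test_omission": 0.05, "test_auc": 0.85},
]

by_aicc = select_by_aicc(candidates, n_occurrences=30)
by_withheld = select_by_withheld(candidates)
print("AICc choice:", by_aicc["settings"])
print("Withheld-data choice:", by_withheld["settings"])
```

With these made-up values the two approaches disagree: the small-sample penalty on the 20-parameter model outweighs its higher likelihood under AICc, while the withheld-data rule prefers it. A disagreement of this kind is a cue to examine ecological plausibility before choosing a model.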
Released 4 years ago, the Wallace EcoMod application (R package wallace) provided an open-source and interactive platform for modeling species niches and distributions that served as a reproducible toolbox and educational resource. wallace harnesses R package tools documented in the literature and makes them available via a graphical user interface that runs analyses and returns code to document and reproduce them. Since its release, feedback from users and partners helped identify key areas for advancement, leading to the development of wallace 2. Following the vision of growth by community expansion, the core development team engaged with collaborators and undertook a major restructuring of the application to enable: simplified addition of custom modules to expand methodological options, analyses for multiple species in the same session, improved metadata features, new database connections, and saving/loading sessions. wallace 2 features nine new modules and added functionalities that facilitate data acquisition from climate-simulation, botanical and paleontological databases; custom data inputs; model metadata tracking; and citations for R packages used (to promote documentation and give credit to developers).