2023
DOI: 10.1111/ecog.06500

Modeling the rarest of the rare: a comparison between multi‐species distribution models, ensembles of small models, and single‐species models at extremely low sample sizes

Abstract: Species distribution models are useful for estimating the distribution and environmental preferences of rare species, but these same species are challenging to model on account of sparse data. We contrast a traditional single-species approach (generalized linear models, GLMs) with two promising frameworks for modeling rare species: ensembles of small models (ESMs), which average across simple models; and multispecies distribution models (MSDMs), which allow rarer species to benefit from statistical 'borrowing …

Cited by 22 publications
(20 citation statements)
References 72 publications
“…Case–control sampling in the context of a multivariate (i.e., multispecies) problem is not trivial and remains an ongoing area of research (Tarekegn et al, 2021). Our joint model's structure likely provides some benefits for rare and infrequently detected species (Ovaskainen & Soininen, 2011, though see Erickson & Smith, 2023), but additional work should be done to quantify the extent to which imbalanced data influence model performance and resulting inferences.…”
Section: Discussion
confidence: 99%
“…Essentially, SDM accuracy is enhanced with an increased amount of data (Fig. 3) [29,30]. In our analysis, we maintained a fixed proportion of 50% for Biome data within the Biome +Traditional dataset, which in turn restricted the amount of available Biome +Traditional data.…”
Section: Discussion
confidence: 99%
“…Independent models were generated using the selected CanaryClim and CHELSA variables at 100-m and 1-km resolution, respectively, together with the selected topographic variables as predictors. As we had between 10 and 100 occurrences per species, we employed 'ensemble of small models' (ESMs), which have been specifically developed for small data sets (Breiner et al, 2015, 2018; Erickson & Smith, 2023; Lomba et al, 2010); ESMs have been successfully implemented in bryophytes in recent studies (Cerrejón et al, 2022; Collart, Hedenäs et al, 2021). Ensemble of small models consist in generating bivariate models with all possible pairs of predictors, which are subsequently combined into an ensemble.…”
Section: Modelling Approach: Ensemble Of Small Models
confidence: 99%
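
The pairwise construction described in the statement above (one bivariate model per pair of predictors, then an ensemble over all of them) can be sketched in a few lines. This is a minimal illustration, not the cited authors' implementation: the helper names (`fit_logistic`, `predict_one`, `fit_esm`, `predict_esm`), the gradient-descent logistic fit, the unweighted average used to combine the models, and the simulated presence/absence data are all assumptions made for the example. The quoted passage does not say how the small models are weighted.

```python
import math
import random
from itertools import combinations


def predict_one(w, row):
    """Predicted probability of presence for one site (w[0] is the intercept)."""
    z = w[0] + sum(wj * v for wj, v in zip(w[1:], row))
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit a logistic regression by batch gradient descent (illustrative only)."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for row, target in zip(X, y):
            err = predict_one(w, row) - target
            grad[0] += err
            for j, v in enumerate(row):
                grad[j + 1] += err * v
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w


def fit_esm(X, y):
    """Fit one bivariate model for every possible pair of predictors."""
    models = []
    for i, j in combinations(range(len(X[0])), 2):
        X_pair = [[row[i], row[j]] for row in X]
        models.append(((i, j), fit_logistic(X_pair, y)))
    return models


def predict_esm(models, row):
    """Combine the small models; an unweighted mean is assumed here."""
    preds = [predict_one(w, [row[i], row[j]]) for (i, j), w in models]
    return sum(preds) / len(preds)


# Simulated data (hypothetical): 30 sites, 4 predictors, presence driven
# mainly by predictor 0 and, more weakly, by predictor 1.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(4)] for _ in range(30)]
y = [1 if row[0] + 0.5 * row[1] > 0 else 0 for row in X]

models = fit_esm(X, y)
```

With four predictors this yields six bivariate models. Each small model estimates only two slopes and an intercept, which is what keeps the number of parameters per model low when occurrences are scarce.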