Question: How can nearest-neighbour (NN) imputation be used to develop maps of multiple species and plant communities? Location: Western and central Oregon, USA, but methods are applicable anywhere. Methods: We demonstrate NN imputation by mapping woody plant communities for >100 000 km² of diverse forests and woodlands. Species abundances on ~25 000 plots were related to spatial predictors (rasters) describing climate, topography, soil and geographic location using constrained ordination (CCA). Species data from the nearest plot in multi-dimensional CCA space were imputed to each map pixel. Maps of multiple individual species and community types were constructed from the single imputed surface. We computed a variety of diagnostics to characterize different qualities of the imputed (mapped) community data. Results: Community composition gradients were strongly associated with climate and elevation, and less so with topography and soil. Accuracy of the imputation model for presence/absence of 150 species varied widely (kappa 0.00 to 0.80). Omission error rates were higher than commission rates due to low species prevalence, and areal representation of species was only slightly inflated. A map of 78 community types was 41% correct and 78% fuzzy correct. Errors of omission and commission were balanced, and areal representation of both rare and abundant communities was accurate. Map accuracy may be lower for some species than with other methods, but areal representation of species and communities across large landscapes is preserved. Because imputed vegetation surfaces are developed for all species simultaneously, map units contain suites of species known to co-occur in nature. 
Maps of individual species, and of community types derived from them, will be internally consistent at map locations. Conclusions: NN imputation is a useful modelling approach where maps of multiple species and plant communities are needed, such as in natural resource management and conservation planning or models that project landscape change under alternative disturbance or climate scenarios. More research is needed to evaluate other ordination methods for NN imputation of plant communities.
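The imputation step described in this abstract can be illustrated with a minimal sketch. It assumes each plot and pixel already has scores on two ordination axes and uses plain Euclidean distance between them; it does not fit a CCA, and the plot identifiers, pixel identifiers, species codes and abundance values are all hypothetical.

```python
import math

# Hypothetical plots: scores on two ordination axes plus observed abundances.
PLOTS = [
    {"id": "p1", "scores": (0.0, 0.0), "species": {"PSME": 40.0, "TSHE": 10.0}},
    {"id": "p2", "scores": (1.0, 0.0), "species": {"PIPO": 30.0}},
    {"id": "p3", "scores": (0.0, 1.0), "species": {"ABAM": 25.0, "TSHE": 5.0}},
]

# Hypothetical map pixels with scores in the same gradient space.
PIXELS = {"a": (0.1, 0.1), "b": (0.9, 0.2), "c": (0.2, 0.8)}


def nearest_plot(pixel_scores, plots):
    """Return the plot closest to the pixel in gradient space."""
    return min(plots, key=lambda p: math.dist(pixel_scores, p["scores"]))


def impute(pixels, plots):
    """Assign each pixel the complete record of its nearest plot."""
    return {pid: nearest_plot(scores, plots) for pid, scores in pixels.items()}


def species_map(imputed, species):
    """Derive a single-species abundance map from the imputed surface."""
    return {pid: plot["species"].get(species, 0.0) for pid, plot in imputed.items()}


imputed = impute(PIXELS, PLOTS)
tshe = species_map(imputed, "TSHE")
```

Because every pixel carries the full species list of one real plot, any per-species map derived from `imputed` is consistent with every other map derived from it.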
Aim: Landscape management and conservation planning require maps of vegetation composition and structure over large regions. Species distribution models (SDMs) are often used for individual species, but projects mapping multiple species are rarer. We compare maps of plant community composition assembled by stacking results from many SDMs with multivariate maps constructed using nearest‐neighbor imputation. Location: Western Cascades ecoregion, Oregon and California, USA. Methods: We mapped distributions and abundances of 28 tree species over 4,007,110 ha at 30‐m resolution using three approaches: SDMs using machine learning (random forest) to yield: (1) binary (RF_Bin); (2) basal area (abundance; RF_Abund) predictions; and (3) multi‐species basal area predictions using a nearest‐neighbor imputation variant based on random forest (RF_NN). We evaluated accuracy of binary predictions for all models, compared area mapped with plot‐based areal estimates, assessed species abundance at two spatial scales and evaluated communities for species richness, problematic compositional errors and overall community composition. Results: RF_Bin yielded the strongest binary predictions (median True Skill Statistics; RF_Bin: 0.57, RF_NN: 0.38, RF_Abund: 0.27). Plot‐scale predictions of abundance were poor for RF_Abund and RF_NN (median Agreement Coefficient (AC): −1.77 and −2.28), but strong when summarized over 50‐km radius tessellated hexagons (median AC for both: 0.79). RF_Abund's strength with abundance and weakness with binary predictions stems from predicting small values instead of zeros. The number of zero value predictions from RF_NN was closest to counts of zeros in the plot data. Correspondingly, RF_NN's map‐based species area estimates closely matched plot‐based area estimates. RF_NN also performed best for community‐level accuracy metrics. 
Conclusions: RF_NN was the best technique for building a broad‐scale map of diversity and composition because the modelling framework maintained inter‐species relationships from the input plot data. Re‐assembling communities from single variable maps often yielded unrealistic communities. Although RF_NN rarely excelled at single species predictions of presence or abundance, it was often adequate for many (but not all) applications in both dimensions. We discuss our results in the context of map utility for applications in the fields of ecology, conservation and natural resource management planning. We highlight how RF_NN is well‐suited for mapping current but not future vegetation.
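The True Skill Statistic reported in this abstract is sensitivity plus specificity minus one. The sketch below computes it and illustrates why predicting small values instead of zeros weakens binary accuracy. It assumes presence means abundance greater than zero, which is a choice made for this example and not necessarily the threshold the authors used; all abundance values are hypothetical.

```python
def true_skill_statistic(observed, predicted):
    """TSS = sensitivity + specificity - 1, from paired presence/absence values."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o and p)
    fn = sum(1 for o, p in pairs if o and not p)
    fp = sum(1 for o, p in pairs if not o and p)
    tn = sum(1 for o, p in pairs if not o and not p)
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one observed presence and one absence")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def to_presence(abundances, threshold=0.0):
    """Assumed rule for this sketch: present when abundance exceeds the threshold."""
    return [a > threshold for a in abundances]


# Hypothetical basal-area values on six plots.
observed = [0.0, 0.0, 0.0, 0.0, 12.0, 8.0]
small_values = [0.3, 0.2, 0.0, 0.1, 10.0, 9.0]   # small predictions where truth is zero
exact_zeros = [0.0, 0.0, 0.0, 0.0, 11.0, 0.0]    # true zeros, but one missed presence

tss_small = true_skill_statistic(to_presence(observed), to_presence(small_values))
tss_zeros = true_skill_statistic(to_presence(observed), to_presence(exact_zeros))
```

In this toy data the small non-zero predictions become false presences and lower specificity, so `tss_small` falls below `tss_zeros` even though the abundance values in `small_values` are numerically close to the observations.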
Conservation planning for wildlife species requires mapping and assessment of habitat suitability across broad areas, often relying on a diverse suite, or stack, of geospatial data representing multidimensional controls on a species. Stacks of univariate, independently developed vegetation layers may not represent the relationships among variables that multivariate modeling techniques can characterize, leading to inaccurate inferences on the distribution of suitable habitat. In this paper, we examine the role of variable combining in mapping multiple dimensions of greater sage-grouse (Centrocercus urophasianus, GRSG) habitat as a basis for GRSG conservation in the Great Basin ecoregion within southeastern Oregon. We compare two modeling approaches, a univariate random forest regression model (RF regression) and a multivariate random forest nearest neighbor (RFNN) imputation model, across an array of variables. These include five GRSG habitat descriptor variables: percent cover of trees, juniper, sagebrush, and GRSG food forbs, and the proportion of grasses that are exotic annuals. We also model species distributions of 51 common species in the sage steppe and combine these predictions to estimate alpha diversity. Our results show that RF regression and RFNN can yield univariate predictions with similar performance, but RF regression predictions tend to contain slightly more bias at broader spatial scales. Stacking univariate predictions from RF regression yields covariance errors that manifest as logical errors (juniper cover > tree cover), biases in estimates of GRSG habitat area, and biases in estimates of alpha diversity. Combining variables from the RFNN model does not introduce covariance errors. 
We conclude that multivariate modeling approaches are better suited to map multidimensional habitat niches at broader spatial scales, and also better suited to provide information for defining multivariable adaptive management triggers at the population level or above.
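The logical error named in this abstract (juniper cover > tree cover) can be checked mechanically. The sketch below is an illustration with hypothetical pixel and plot values, not the authors' workflow: it compares a stack of independently predicted layers with records copied whole from reference plots, as an imputation model would produce.

```python
def logical_errors(records):
    """Return ids of pixels whose juniper cover exceeds total tree cover."""
    return [pid for pid, r in records.items() if r["juniper"] > r["tree"]]


# Hypothetical stacked output: each layer predicted by a separate model,
# so nothing ties the juniper estimate to the tree estimate.
stacked = {
    "a": {"tree": 5.0, "juniper": 7.5},
    "b": {"tree": 20.0, "juniper": 12.0},
}

# Hypothetical reference plots; field records satisfy juniper <= tree.
plots = {
    "p1": {"tree": 6.0, "juniper": 6.0},
    "p2": {"tree": 18.0, "juniper": 11.0},
}

# Imputation copies one whole plot record to each pixel, so the
# relationship between variables in the plot data carries over.
imputed = {"a": plots["p1"], "b": plots["p2"]}

stacked_errors = logical_errors(stacked)
imputed_errors = logical_errors(imputed)
```

Here `stacked_errors` flags pixel `a`, while `imputed_errors` is empty because each imputed pixel inherits a jointly observed set of values.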