Motivation: The availability of user‐friendly, high‐resolution global environmental datasets is crucial for bioclimatic modelling. For terrestrial environments, WorldClim has served this purpose since 2005, but equivalent marine data only became available in 2012, with pioneer initiatives like Bio‐ORACLE providing data layers for several ecologically relevant variables. Currently, the available marine data packages have not yet been updated to the most recent Intergovernmental Panel on Climate Change (IPCC) predictions nor to present times, and are mostly restricted to the top surface layer of the oceans, precluding the modelling of a large fraction of the benthic diversity that inhabits deeper habitats. To address this gap, we present a significant update of Bio‐ORACLE for new future climate scenarios, present‐day conditions and benthic layers (near sea bottom). The reliability of data layers was assessed using a cross‐validation framework against in situ quality‐controlled data. This test showed a generally good agreement between our data layers and the global climatic patterns. We also provide a package of functions in the R software environment (sdmpredictors) to facilitate listing, extraction and management of data layers and allow easy integration with the available pipelines for bioclimatic modelling.

Main types of variable contained: Surface and benthic layers for water temperature, salinity, nutrients, chlorophyll, sea ice, current velocity, phytoplankton, primary productivity, iron and light at bottom.

Spatial location and grain: Global at 5 arcmin (c. 0.08° or 9.2 km at the equator).

Time period and grain: Present (2000–2014) and future (2040–2050 and 2090–2100) environmental conditions based on monthly averages.

Major taxa and level of measurement: Marine biodiversity associated with sea surface and epibenthic habitats.

Software format: ASCII and TIFF grid formats for geographical information systems and a package of functions developed for R software.
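The 5 arcmin grain quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a spherical Earth with the WGS84 equatorial radius; the function names are illustrative and are not part of Bio‐ORACLE or sdmpredictors. Under these assumptions a cell is about 0.083° wide, which is a little under 9.3 km at the equator and close to the approximate figure given in the abstract.

```python
import math

# Assumption: spherical Earth using the WGS84 equatorial radius.
EARTH_EQUATORIAL_RADIUS_KM = 6378.137


def cell_size_deg(arcmin: float) -> float:
    """Convert a grid resolution in arcminutes to decimal degrees."""
    return arcmin / 60.0


def cell_width_km(arcmin: float, latitude_deg: float = 0.0) -> float:
    """East-west width of one grid cell, in km, at the given latitude.

    The width shrinks with the cosine of latitude because meridians converge
    towards the poles.
    """
    km_per_degree = 2.0 * math.pi * EARTH_EQUATORIAL_RADIUS_KM / 360.0
    return cell_size_deg(arcmin) * km_per_degree * math.cos(math.radians(latitude_deg))


def global_grid_shape(arcmin: float) -> tuple[int, int]:
    """Number of (rows, columns) in a global latitude-longitude grid."""
    cells_per_degree = 60.0 / arcmin
    return round(180 * cells_per_degree), round(360 * cells_per_degree)


if __name__ == "__main__":
    print(f"cell size: {cell_size_deg(5):.4f} degrees")
    print(f"width at equator: {cell_width_km(5):.2f} km")
    print(f"width at 60 degrees latitude: {cell_width_km(5, 60):.2f} km")
    print(f"global grid shape (rows, cols): {global_grid_shape(5)}")
```

Because the cell width halves by 60° latitude, the nominal equatorial figure overstates the east-west cell size for high-latitude study areas.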
Aim: Ideally, datasets for species distribution modelling (SDM) contain evenly sampled records covering the entire distribution of the species, confirmed absences and auxiliary ecophysiological data allowing informed decisions on relevant predictors. Unfortunately, these criteria are rarely met for marine organisms, whose distributions are too often only scantly characterized and whose absences are generally not recorded. Here, we investigate predictor relevance as a function of modelling algorithms and settings for a global dataset of marine species.

Location: Global marine.

Methods: We selected well‐studied and identifiable species from all major marine taxonomic groups. Distribution records were compiled from public sources (e.g., OBIS, GBIF, Reef Life Survey) and linked to environmental data from Bio‐ORACLE and MARSPEC. Using this dataset, predictor relevance was analysed under different variations of modelling algorithms, numbers of predictor variables, cross‐validation strategies, sampling bias mitigation methods, evaluation methods and ranking methods. SDMs for all combinations of predictors from eight correlation groups were fitted and ranked, from which the top five predictors were selected as the most relevant.

Results: We collected two million distribution records from 514 species across 18 phyla. Mean sea surface temperature and calcite are, respectively, the most and least relevant predictors. The other predictors showed a less clear pattern. The biggest differences in predictor relevance were induced by varying the number of predictors, the modelling algorithm and the sample selection bias correction. The distribution data and associated environmental data are made available through the R package marinespeed and at http://marinespeed.org.

Main conclusions: While temperature is a relevant predictor of global marine species distributions, considerable variation in predictor relevance is linked to the SDM set‐up. We promote the usage of a standardized benchmark dataset (MarineSPEED) for methodological SDM studies.
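The Methods step of fitting models for all combinations of predictors from correlation groups and then ranking predictors can be illustrated as follows. This is only a sketch under stated assumptions, not the authors' procedure: it assumes one predictor is drawn from each group, that each predictor set already has a model score supplied by the caller (for example a cross‐validated evaluation metric), and that predictors are ranked by the mean score of the sets containing them. The group names, predictor names and helper functions are hypothetical.

```python
from itertools import product
from statistics import mean

# Hypothetical correlation groups (the study used eight; three are shown here).
GROUPS = {
    "temperature": ["sst_mean", "sst_range"],
    "salinity": ["sss_mean"],
    "productivity": ["chlorophyll_mean", "calcite_mean"],
}


def predictor_sets(groups: dict[str, list[str]]) -> list[tuple[str, ...]]:
    """Enumerate every predictor set that takes one predictor from each group."""
    ordered_groups = sorted(groups)
    return [tuple(combo) for combo in product(*(groups[g] for g in ordered_groups))]


def rank_predictors(scores: dict[tuple[str, ...], float], top_n: int) -> list[str]:
    """Rank predictors by the mean score of all predictor sets that contain them."""
    per_predictor: dict[str, list[float]] = {}
    for predictor_set, score in scores.items():
        for predictor in predictor_set:
            per_predictor.setdefault(predictor, []).append(score)
    ranked = sorted(per_predictor, key=lambda p: mean(per_predictor[p]), reverse=True)
    return ranked[:top_n]


def toy_score(predictor_set: tuple[str, ...]) -> float:
    """Placeholder score standing in for a fitted and evaluated SDM."""
    score = 0.9 if "sst_mean" in predictor_set else 0.6
    if "calcite_mean" in predictor_set:
        score -= 0.1
    return score


if __name__ == "__main__":
    sets = predictor_sets(GROUPS)
    scores = {s: toy_score(s) for s in sets}
    print(len(sets), "predictor sets")
    print("top predictors:", rank_predictors(scores, top_n=2))
```

In a real analysis `toy_score` would be replaced by fitting and evaluating an SDM for each predictor set, and the number of sets grows as the product of the group sizes.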