Abstract.-This contribution explores the problem of recognizing and measuring the universe of specimen-level data existing in natural history collections around the world, and in absence of a complete, world-wide census or register. Estimates of size seem necessary to plan for resource allocation for digitization or data capture, and may help to represent how many vouchered primary biodiversity data (in terms of collections, specimens or curatorial units) might remain to be mobilized. It further helps to set priorities, and assess certainties.Three general approaches are proposed for further development, and initial estimates are given. Probabilistic models involve crossing data from a set of biodiversity datasets, finding commonalities and estimating the likelihood of totally obscure data from the fraction of known data missing from specific datasets in the set. Distribution models aim to find the underlying distribution of collections' compositions, estimating the hidden sector of the distributions. Finally, case studies seek to compare digitized data from collections known to the world to the amount of data known to exist in the collection but not generally available or not digitized.Preliminary estimates of size range from 1.2 to 2.1 gigaunits (10 9 ) of which a mere 3% at most is currently web-accessible through GBIF's mobilization efforts. However, further data and analyses, along with other approaches relying more heavily on surveys, might change the picture and possibly help to narrow the estimate further. In particular, unknown collections not having emerged through literature are the major source of uncertainty.
Our world is in the midst of unprecedented change—climate shifts and sustained, widespread habitat degradation have led to dramatic declines in biodiversity rivaling historical extinction events. At the same time, new approaches to publishing and integrating previously disconnected data resources promise to help provide the evidence needed for more efficient and effective conservation and management. Stakeholders have invested considerable resources to contribute to online databases of species occurrences. However, estimates suggest that only 10% of biocollections are available in digital form. The biocollections community must therefore continue to promote digitization efforts, which in part requires demonstrating compelling applications of the data. Our overarching goal is therefore to determine trends in use of mobilized species occurrence data since 2010, as online systems have grown and now provide over one billion records. To do this, we characterized 501 papers that use openly accessible biodiversity databases. Our standardized tagging protocol was based on key topics of interest, including: database(s) used, taxa addressed, general uses of data, other data types linked to species occurrence data, and data quality issues addressed. We found that the most common uses of online biodiversity databases have been to estimate species distribution and richness, to outline data compilation and publication, and to assist in developing species checklists or describing new species. Only 69% of papers in our dataset addressed one or more aspects of data quality, which is low considering common errors and biases known to exist in opportunistic datasets. Globally, we find that biodiversity databases are still in the initial stages of data compilation. Novel and integrative applications are restricted to certain taxonomic groups and regions with higher numbers of quality records. Continued data digitization, publication, enhancement, and quality control efforts are necessary to make biodiversity science more efficient and relevant in our fast-changing environment.
-Freely available high quality, data on species occurrence and associated variables are needed in order to track changes in biodiversity. One of the main issues surrounding the provision of such data is that sources vary in quality, scope, and accuracy. Publishers of such data must face the challenge of maximizing quality, utility and breadth of data coverage, in order to make such data useful to users. With the Global Biodiversity Information Facility (GBIF), we recently conducted a content needs assessment survey to consolidate and synthesize major user needs regarding biodiversity data. We find a broad range of recommendations from the survey respondents, principally concerning issues such as data quality, bias, and coverage, and ease of access. We recommend a candidate set of actions for the GBIF that fall into three classes: 1) addressing data gaps, data volume, and data quality, 2) aggregating data types that are relatively new to GBIF, to support emerging new applications, and 3) promoting ease-of-use and providing incentives for wider use. Addressing the challenge of providing high quality primary biodiversity data potentially can serve the needs of national and international biodiversity initiatives. These include the "flexible framework" for addressing the new 2020 biodiversity targets of the Convention on Biological Diversity, the global biodiversity observation network (GEO BON) and the new Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES). Each of these presents opportunities for countries to define appropriate actions and corresponding data needs, with links from local to global scales.
We are in the midst of unprecedented change—climate shifts and sustained, widespread habitat degradation have led to dramatic declines in biodiversity rivaling historical extinction events. At the same time, new approaches to publishing and integrating previously disconnected data resources promise to help provide the evidence needed for more efficient and effective conservation and management. Stakeholders have invested considerable resources to contribute to online databases of species occurrences and genetic barcodes. However, estimates suggest that only 10% of biocollections are available in digital form. The biocollections community must therefore continue to promote digitization efforts, which in part requires demonstrating compelling applications of the data. Our overarching goal is therefore to determine trends in use of mobilized species occurrence data since 2010, as online systems have grown and now provide over one billion records. To do this, we characterized 501 papers that use openly accessible biodiversity databases. Our standardized tagging protocol was based on key topics of interest, including: database(s) used, taxa addressed, general uses of data, other data types linked to species occurrence data, and data quality issues addressed. We found that the most common uses of online biodiversity databases have been to estimate species distribution and richness, to outline data compilation and publication, and to assist in developing species checklists or describing new species. Only 69% of papers in our dataset addressed one or more aspects of data quality, which is low considering common errors and biases known to exist in opportunistic datasets. Globally, we find that biodiversity databases are still in the initial stages of data compilation. Novel and integrative applications are restricted to certain taxonomic groups and regions with higher numbers of quality records. Continued data digitization, publication, enhancement, and quality control efforts are necessary to make biodiversity science more efficient and relevant in our fast-changing world.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.