Observations of living organisms by citizen scientists that are reported to online portals are a valuable source of information. They are also a special kind of volunteered geographic information (VGI). VGI data have issues of completeness, which arise from biases caused by the opportunistic nature of the data collection process. We examined the completeness of bird species represented in citizen science observation data from eBird and iNaturalist in US National Parks (NPs). We used approaches for completeness estimation which were developed for data from OpenStreetMap, a crowdsourced map of the world. First, we used an extrinsic approach, comparing species lists from citizen science data with National Park Service lists. Second, we examined two intrinsic approaches, one using total observation numbers in NPs and one using the number of new species added to the dataset over time. Results from the extrinsic approach provided appropriate completeness estimations to evaluate the intrinsic approaches. We found that total observation numbers are a good estimator of species completeness of citizen science data from US NPs. There is also a close relationship between species completeness and the ratio of new species added to observation data vs. observation numbers in a given year.
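The two kinds of measure described in this abstract can be illustrated with a minimal Python sketch. The abstract does not give exact formulas, so the definitions here are assumptions: extrinsic completeness is taken as the share of reference-list species that also occur in the observation data, and the intrinsic ratio as the number of species first recorded in a year divided by the number of observations in that year. The function names and the `(year, species)` input format are hypothetical.

```python
from collections import defaultdict


def extrinsic_completeness(observed_species, reference_species):
    """Share of reference-list species that also appear in the observations.

    Assumed definition; the abstract only says the two lists are compared.
    """
    reference = set(reference_species)
    if not reference:
        raise ValueError("reference list is empty")
    return len(set(observed_species) & reference) / len(reference)


def new_species_ratio_by_year(observations):
    """Per year: species first seen in that year / observations in that year.

    observations: iterable of (year, species) tuples (hypothetical format).
    """
    counts = defaultdict(int)          # observations per year
    species_by_year = defaultdict(set)  # distinct species per year
    for year, species in observations:
        counts[year] += 1
        species_by_year[year].add(species)

    seen = set()   # species recorded in any earlier year
    ratios = {}
    for year in sorted(counts):
        new_species = species_by_year[year] - seen
        ratios[year] = len(new_species) / counts[year]
        seen |= species_by_year[year]
    return ratios
```

Under these assumptions, a low ratio in recent years would suggest that few new species are still being added relative to observation effort, which is the kind of signal the intrinsic approach relies on.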
People share data in different ways. Many of them contribute on a voluntary basis, while others are unaware of their contribution. They have differing intentions, collaborate in different ways, and contribute data about differing aspects. Shared Data Sources have been explored individually in the literature, in particular OpenStreetMap and Twitter, and some types of Shared Data Sources have been widely studied, such as Volunteered Geographic Information (VGI), Ambient Geographic Information (AGI), and Public Participation Geographic Information Systems (PPGIS). A thorough and systematic discussion of Shared Data Sources in their entirety is, however, still missing. To establish such a discussion, we introduce in this article a schema consisting of a number of dimensions for characterizing socially produced, maintained, and used ‘Shared Data Sources,’ as well as corresponding visualization techniques. Both the schema and the visualization techniques allow for a common characterization in order to set individual data sources into context and to identify clusters of Shared Data Sources with common characteristics. Among other things, this makes it possible to choose suitable Shared Data Sources for a given task and to understand how to interpret them by drawing parallels between several Shared Data Sources.
The last few years have seen the emergence of a large number of worldwide web portals where volunteers report and collect observations of plants and animals, share these reports with other users, and provide data for scientific research purposes along the way. Activities engaging citizens in the collection of scientific data or in solving scientific problems are collectively called citizen science. Data quality is a vital issue in this field. Currently, reports of species observations from citizen scientists are often validated manually by experts as a means of quality control. Experts evaluate the plausibility of a report based on their own expertise and experience. However, a rapid growth in the quantity of reports to be processed makes this approach increasingly less feasible, creating a need for methods supporting (semi)automatic validation of observation data. This aim is achieved primarily by analysing the spatial and temporal context of the data. Relevant context information can be provided by existing observation data, as well as by spatial data of environmental factors, or other spatio-temporal factors impacting the distribution of species, or the process of observation and contribution itself. It is very important that the specific properties of data emerging from citizen science origins are taken into account. These data are often not produced in a systematic way, resulting in (for instance) spatial and temporal incompleteness. Also, the data structure is determined not only by the natural spatio-temporal patterns of species distribution, but also by other factors such as the behaviour of contributors or the design of the citizen science project that produced the data.