Volunteer geographical information (VGI), whether gathered through citizen science or mined from social media, has proven useful in various domains, including natural hazards, health status, disease epidemics, and biological monitoring. Nonetheless, the variable or unknown data quality arising from crowdsourcing settings remains an obstacle to fully integrating these data sources into environmental studies and, potentially, policy making. The data curation process, in which quality assurance (QA) is needed, is often driven by the direct usability of the collected data within a data conflation or data fusion (DCDF) process, which combines the crowdsourced data into one view, potentially using other data sources as well. Reviewing current practices in VGI data quality and using two examples, namely land cover validation and inundation extent estimation, this paper discusses the close links between QA and DCDF. It aims to help in deciding whether disentangling the two is possible and beneficial for understanding the data curation process and its methodology for future uses of crowdsourced data. Analysing situations throughout the data curation process where and when entanglement between QA and DCDF occurs, the paper explores the various facets of VGI data capture, as well as data quality assessment and its purposes. Far from rejecting the ISO usability quality criterion, the paper advocates decoupling the QA process from the DCDF step as much as possible, while still integrating them within an approach analogous to a Bayesian paradigm.
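The decoupling the abstract describes can be illustrated with a minimal sketch: QA estimates each volunteer's reliability from past agreement with reference data (acting as a prior), and the DCDF step then combines observations weighted by those reliabilities rather than re-deriving quality from the fusion itself. All names (`qa_prior`, `conflate`) and the Beta-Binomial model are illustrative assumptions, not the paper's actual method.

```python
def qa_prior(n_agree: int, n_total: int, alpha: float = 1.0, beta: float = 1.0) -> float:
    """QA step (hypothetical): posterior mean reliability of a contributor under a
    Beta(alpha, beta) prior, given past agreement with reference data."""
    return (alpha + n_agree) / (alpha + beta + n_total)

def conflate(observations: list[tuple[bool, float]]) -> float:
    """DCDF step (hypothetical): reliability-weighted vote over binary observations
    (e.g. 'is this cell inundated?'), reusing QA reliabilities as weights."""
    yes = sum(r for value, r in observations if value)
    no = sum(r for value, r in observations if not value)
    return yes / (yes + no)

# Three volunteers report on the same map cell; weights come from their QA history.
obs = [(True, qa_prior(8, 10)), (True, qa_prior(3, 10)), (False, qa_prior(9, 10))]
belief = conflate(obs)  # belief that the cell is inundated
```

The point of the sketch is the separation of concerns: `qa_prior` can be audited and reused independently of any particular fusion task, while `conflate` consumes its output without entangling quality estimation with the fused product.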
This chapter considers the potential for citizen science to contribute to policy development. A background to evidence-based policy making is given, and the requirement for data to be robust, reliable and, increasingly, cost-effective is noted. The potential for 'co-design' strategies with stakeholders, to add value to their engagement as well as provide more meaningful data that can contribute to policy development, is presented and discussed. Barriers to uptake can be institutional, and the quality of data used in evidence-based policy making will always need to be fully assured. Data must be appropriate to the decision-making process at hand, and there is potential for citizen science to fill important existing data gaps.
Environmental policy involving citizen science (CS) is of growing interest. In support of this open stream of information, validation or quality assessment of geo-located CS data, relative to their intended use in evidence-based policy making, requires a flexible, easily adaptable, and transparent data curation process. Addressing these needs, this paper describes an approach for automatic quality assurance as proposed by the Citizen OBservatory WEB (COBWEB) FP7 project. This approach is based upon a workflow composition that combines different quality controls, each belonging to seven categories or "pillars". Each pillar focuses on a specific dimension in the types of reasoning algorithms for CS data qualification. These pillars attribute values to a range of quality elements belonging to three complementary quality models. Additional data from various sources, such as Earth Observation (EO) data, are often included among the inputs of quality controls within the pillars. However, qualified CS data can also contribute to the validation of EO data; the question of validation can therefore be considered as "two sides of the same coin". Based on an invasive species CS study concerning Fallopia japonica (Japanese knotweed), the paper discusses the flexibility and usefulness of qualifying CS data, either when using an EO data product for validation within the quality assurance process, or when validating an EO data product that describes the risk of occurrence of the plant. Both validation paths are found to be improved by quality assurance of the CS data. Addressing the reliability of CS open data, issues and limitations of the role of quality assurance for validation, due to the quality of secondary data used within the automatic workflow, are described, e.g., error propagation, paving the route to improvements in the approach.
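The workflow-composition idea described above can be sketched as a pipeline of quality controls that each write into a shared quality report. The control names, the two example checks, and the report keys below are hypothetical placeholders; COBWEB's actual pillars and quality elements are richer and differ in detail.

```python
from typing import Callable, Dict, List

QualityReport = Dict[str, float]
Control = Callable[[dict, QualityReport], QualityReport]

def location_check(obs: dict, report: QualityReport) -> QualityReport:
    # Illustrative pillar: positional plausibility (was the report made
    # inside the study area?).
    report["positional_confidence"] = 1.0 if obs["in_study_area"] else 0.2
    return report

def photo_check(obs: dict, report: QualityReport) -> QualityReport:
    # Illustrative pillar: attribute plausibility (is supporting evidence,
    # such as a photo, attached?).
    report["attribute_confidence"] = 0.9 if obs["has_photo"] else 0.5
    return report

def run_workflow(obs: dict, controls: List[Control]) -> QualityReport:
    """Compose quality controls into one workflow; each control adds or
    refines quality elements in the shared report."""
    report: QualityReport = {}
    for control in controls:
        report = control(obs, report)
    return report

report = run_workflow({"in_study_area": True, "has_photo": False},
                      [location_check, photo_check])
```

Because each control has the same signature, workflows can be recomposed per study (e.g. swapping in an EO-based check for the knotweed case) without changing the surrounding curation machinery, which is the flexibility the abstract emphasises.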