Abstract. In this paper, we introduce the CrowdTruth open-source software framework for machine-human computation, which implements a novel approach to gathering human annotation data across a wide range of annotation tasks and a variety of media (e.g. text, images, videos). The CrowdTruth approach captures human semantics through a pipeline of three processes: a) combining machine processing of text, image and video to better understand the input content and optimise its suitability for micro-tasks, thereby optimising the time and cost of the crowdsourcing process; b) providing reusable human-computing task templates to collect the maximum diversity in human interpretation, thereby collecting richer human semantics; and c) implementing 'disagreement metrics', i.e. CrowdTruth metrics, to support deep analysis of the quality and semantics of the crowdsourcing data. Instead of the traditional inter-annotator agreement, we use annotator disagreement as a signal to evaluate data quality, ambiguity, and vagueness. In this paper we demonstrate the innovative CrowdTruth approaches embodied in the software to: 1) support processing of different text, image and video data; 2) support a variety of annotation tasks; 3) harness worker disagreement with CrowdTruth metrics; and 4) provide an interface to support data analysis and visualisation. In previous work we introduced the CrowdTruth methodology with examples for semantic interpretation of medical text for relation and factor extraction, and with newspaper text for event extraction. In this paper, we demonstrate the applicability and robustness of the approach to a wide variety of problems across a number of domains. We also show the advantages of using open standards and the extensibility of the framework with new data modalities and annotation tasks, as well as its openness to external services.
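The abstract does not define the CrowdTruth metrics, so the following is only a minimal sketch of how worker disagreement on a single media unit could be turned into a per-annotation signal. It assumes each worker selects a set of annotations, sums the selections into a count vector, and scores each annotation by the cosine between that vector and the annotation's own axis. The function name `unit_annotation_scores` and the example labels are hypothetical and are not taken from the framework.

```python
import math


def unit_annotation_scores(worker_choices, annotations):
    """Score each annotation for one media unit from the workers' selections.

    worker_choices: list of sets, one set of selected annotations per worker.
    annotations: list of all annotations offered in the task.
    Returns a dict mapping annotation -> cosine score in [0, 1].
    """
    # Count how many workers picked each annotation (the unit's count vector).
    counts = [sum(1 for chosen in worker_choices if a in chosen)
              for a in annotations]
    norm = math.sqrt(sum(c * c for c in counts))
    if norm == 0:
        # Nobody selected anything: no signal for any annotation.
        return {a: 0.0 for a in annotations}
    # Cosine between the count vector and the annotation's basis vector
    # reduces to count / norm.
    return {a: c / norm for a, c in zip(annotations, counts)}


# Hypothetical example: four workers annotate one sentence.
labels = ["treats", "causes", "none"]
choices = [{"treats"}, {"treats"}, {"treats", "causes"}, {"causes"}]
scores = unit_annotation_scores(choices, labels)
for label in labels:
    print(f"{label}: {scores[label]:.3f}")
```

Under this assumed scoring, a unit on which all workers agree gives one annotation a score of 1, while a unit whose selections are spread over several annotations gives lower scores to each, which is one way to read disagreement as a sign of ambiguity rather than noise.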
This paper presents an unbiased, data-driven Stochastic Workflow (SW) in which surface and subsurface uncertainties are accounted for and their impact on Facilities design and operational decisions is quantified. Unlike the traditional approach in Facilities design, where the 'most conservative values' are typically used as design input variables, the proposed workflow accounts for lifecycle variability and correlations of relevant input data. The workflow enables superior risk management and resource allocation. An example is provided in which the traditional Facilities design outcomes are compared with the Stochastic Workflow findings. Deterministic Models are established to account for the dependencies between design input variables (Static Variables, i.e. bottom hole pressure and temperature) and the desired objective (Static Results, i.e. chemical injection rate). However, in real-life situations, the analyzed variables change due to subsurface and surface events with different levels of uncertainty (e.g. condensate banking, lean gas injection, water breakthrough). Stochastic algorithms are used to create Probability Distribution Functions (PDF) for all analyzed design input variables (Stochastic Variables). Stochastic Algorithms are then applied to the Deterministic Model, sampling from the previously defined probability distributions. Stochastic Results are assembled into charts and used to analyze the most relevant variables and their correlations affecting the model objectives. The workflow provides an objective quantification of risks and uncertainties impacting the design and operation of the analyzed system. The deterministic design approach in the example leaves risk present in 11% of cases and wastes resources in 77% of cases. In the revised design, based on the Stochastic Workflow, the risk and wastage are reduced to less than 1%.
The associated OPEX component is reduced from USD 12 mln to USD 8 mln (-33%), expressed in Present Value terms. This paper contributes to the efforts of bridging the gap between subsurface and surface disciplines, and demonstrates the utility of an integrated approach in Facilities planning, where both subsurface and surface uncertainties are accounted for. This approach contributes to the elimination of subjective decision biases (Waring, 2017), enabling superior Project and Asset Management. The proposed Stochastic Workflow is scalable and transferable, and suited to collaborative, multidisciplinary project and asset teams. Additional benefits of the Stochastic Workflow, such as improved Well and Reservoir Management (Virtual PLT) or increased system availability, are also mentioned in the paper.
The paper asserts that traditional Facilities planning methodologies, heavily based on Design Basis documents and biased towards the 'most conservative conditions', fail to recognize the entirety of operational conditions throughout the oilfield lifecycle, leading to significant residual risk and the wastage of resources in the Operations stage. An integrated stochastic approach is proposed, accounting for both subsurface and surface uncertainties and their interrelations throughout the field life. A practical example is also provided. The proposed methodology employs Big Data algorithms to quantify subsurface and surface uncertainties and produce Probability Distribution Functions (PDF) for all relevant variables. Data sources are diverse, normally available in a project team, and may include geological and reservoir models and production forecasts, PVT reports, nodal and network analyses, and environmental databases. The relevant variables are expected to change throughout field life with different levels of uncertainty, due to planned and unplanned events such as condensate banking, lean gas injection for pressure maintenance, and water breakthrough. Stochastic algorithms (e.g. the Monte Carlo method) are then applied to the deterministic model, randomly sampling from the previously defined probability distributions. Results are summarized as Tornado charts and spider or scatter plots, which are then used to analyze the most relevant variables and correlations affecting the objective function. The example results indicate that risk is still present in 9% of cases for the original, deterministic design. Furthermore, resources are wasted in 76% of cases. In the integrated stochastic design the risks and wastage are reduced to a range of 0 to 1%. The associated OPEX components are reduced from USD 3 mln to USD 1 mln (-66%), expressed in Present Value terms. An unquantified increase in system availability is also noted.
The paper demonstrates the utility of an integrated, stochastic approach in Facilities planning, accounting for both subsurface and surface uncertainties and their interrelations throughout the field's life. This approach eliminates the Design Basis induced bias and enables superior decisions at Project and Asset levels. The proposed approach is scalable, transferable to other oilfield challenges, and suited to multidisciplinary work environments.
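The sampling step described above (probability distributions for the input variables, a deterministic model, and Monte Carlo sampling through that model) can be illustrated with a minimal sketch. Everything numeric here is an assumption made for illustration: the linear model in `required_injection_rate`, the triangular and normal distributions, the 20% over-design threshold, and the design rate are hypothetical and are not taken from the paper.

```python
import random


def required_injection_rate(pressure_bar, temperature_c):
    """Hypothetical deterministic model: required chemical injection rate.

    The linear form is illustrative only and is not the paper's model.
    """
    return 0.05 * pressure_bar + 0.2 * max(0.0, 60.0 - temperature_c)


def run_monte_carlo(n_samples, design_rate, seed=0):
    """Return (risk_fraction, waste_fraction) for a fixed design rate."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    under = 0  # design rate below the required rate: residual risk
    over = 0   # design rate more than 20% above the required rate: waste
    for _ in range(n_samples):
        # Assumed input distributions for the stochastic variables.
        pressure = rng.triangular(100.0, 300.0, 200.0)  # low, high, mode
        temperature = rng.gauss(50.0, 10.0)             # mean, std dev
        required = required_injection_rate(pressure, temperature)
        if required > design_rate:
            under += 1
        elif design_rate > 1.2 * required:
            over += 1
    return under / n_samples, over / n_samples


risk_fraction, waste_fraction = run_monte_carlo(10_000, design_rate=14.0)
print(f"risk: {risk_fraction:.1%}, waste: {waste_fraction:.1%}")
```

Sweeping `design_rate` over a range and re-running the simulation would show how the two fractions trade off, which is the kind of comparison the abstract reports between the deterministic and stochastic designs.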