Performance evaluations of oil and gas assets are crucial for continuously improving operational efficiency in the mainstream petroleum industry. The success of such evaluations is largely driven by the analysis of the data accumulated during the asset's operational cycle. Usually, the amount of data stored in the databases far exceeds what traditional spreadsheet-based tools or linear modeling can handle. In this study, we use data mining with multivariate predictive analytics to monetize the value of the data by transforming the inferred information into knowledge and then into rigorous business decisions.
With the expansion of the Digital Oil Field and the transition to the Fourth Industrial Revolution, the oil and gas industry is acquiring tremendous amounts of data that come from disparate sources and vary in origin, time scale, structure and quality. The underlying root-cause relationships among the variables are highly non-linear and non-intuitive, so simplistic linear regression methods are suboptimal. We approach the challenge by developing a data-driven workflow that integrates components of artificial intelligence, machine learning and pattern recognition to enhance the quantitative understanding of complex data.
The sanitized, aggregated data set covers 470 horizontal wells, with 15 numerical (e.g., stimulation interval length, production rates) and categorical (e.g., target zone, proppant type) predictors and the total produced barrels of oil equivalent (BOE) as the response variable. The objective is to identify the set of predictor values that maximizes production. We utilize an integrated analytics platform that enables a variety of sophisticated statistical operations on large-scale data: a) comprehensive data QA/QC for outliers, consistency and missing entries; b) exploratory data analysis and visualization; c) feature selection, screening and ranking; d) building and training of multiple machine learning (ML) models for multivariate regression (e.g., generalized linear model, deep learning, decision tree, random forest and gradient boosted machine); and e) response optimization of the ML model identified as "best-performing", that is, the one with the highest prediction accuracy.
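As an illustration only, steps a) and c) of the workflow above could be sketched as follows. This is a minimal sketch under stated assumptions, not the platform or the method used in the study: the column names (`interval_length`, `noise`, `boe`), the synthetic wells, the Tukey-fence outlier rule and the Pearson-correlation ranking are all hypothetical stand-ins chosen so that the example runs with the Python standard library alone.

```python
import random
import statistics

# Hypothetical column names; the study names only the kinds of variables used.
PREDICTORS = ["interval_length", "noise"]
RESPONSE = "boe"


def iqr_bounds(values, k=1.5):
    """Return the (low, high) Tukey fences for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def clean(rows, columns):
    """Step a): drop rows with missing entries or values outside the fences."""
    complete = [r for r in rows if all(r.get(c) is not None for c in columns)]
    bounds = {c: iqr_bounds([r[c] for r in complete]) for c in columns}
    return [
        r for r in complete
        if all(bounds[c][0] <= r[c] <= bounds[c][1] for c in columns)
    ]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def rank_features(rows, predictors, response):
    """Step c): rank predictors by absolute correlation with the response."""
    y = [r[response] for r in rows]
    scores = {p: abs(pearson([r[p] for r in rows], y)) for p in predictors}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Synthetic wells (assumption): production depends on interval length only.
rng = random.Random(0)
wells = []
for _ in range(200):
    length = rng.uniform(1000.0, 3000.0)
    wells.append({
        "interval_length": length,
        "noise": rng.gauss(0.0, 1.0),
        "boe": 3.0 * length + rng.gauss(0.0, 100.0),
    })
# Two deliberately bad records: a missing entry and an extreme outlier.
wells.append({"interval_length": None, "noise": 0.0, "boe": 5000.0})
wells.append({"interval_length": 1e9, "noise": 0.0, "boe": 5000.0})

cleaned = clean(wells, PREDICTORS + [RESPONSE])
ranking = rank_features(cleaned, PREDICTORS, RESPONSE)
print(len(wells), "raw wells,", len(cleaned), "after QA/QC")
for name, score in ranking:
    print(f"{name}: |r| = {score:.3f}")
```

A linear correlation screen is used here purely for brevity. Given the non-linear relationships described above, a real workflow would more plausibly rank features with a model-based importance measure and would also need to encode the categorical predictors, neither of which this sketch attempts.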
Our study introduces an initiative to establish concepts and best practices for predictive and prescriptive analytics in the domains of reservoir simulation, description and asset management. Given the unique volume and information richness of operational data acquired over decades of production history, the anticipated applications of predictive analytics could expand to drilling optimization, smart data aggregation, well stimulation and equipment maintenance.