The last decade has seen a sharp rise in casing failures. The consequences of casing failure can include blowouts, environmental pollution, injuries and fatalities, and loss of the entire well, to name a few. The motivation behind this work is to present findings from a predictive analytics investigation of casing failure data using supervised and unsupervised data mining algorithms. Scientists and researchers have speculated about the potential underlying causes of failure, but to date this type of work remains unpublished and unavailable in the public domain literature. The study assembled comprehensive data from eighty land-based wells during drilling, fracturing, workover, and production operations. Twenty wells suffered from casing failure, while data for the remaining sixty offset wells were compiled from well reports, fracturing treatment data, drilling records, and recovered casing data. The failures were not systematic but included fatigue failure, bending stresses from excessive dogleg, buckling, high hoop stress on connections, and split coupling. The failures were detected at various depths, in both cemented and uncemented hole sections, and were observed in both the upper and lower production casing. Using predictive analytics software from SAS, twenty-six variables were evaluated by applying various data mining techniques to the failed casing data points. Missing data were handled using multivariate normal imputation. The study outcome addressed common casing sizes and couplings involved with each failure, failure location, hydraulic fracturing stages, cement impairment, dogleg severity (DLS), thermal and tensile loads, and production-induced shearing. The predictive algorithms used in this study included Logistic Regression, supervised Hierarchical Clustering, and Decision Trees. The descriptive analytics, presented as visual representations, included Scatterplot Matrices and PivotTables.
A combination of failure causes was identified. Five statistical techniques based on the aforementioned algorithms were developed to evaluate the concurrent effect of these interacting variables. Nineteen variables were believed to contribute strongly to failure. The scatterplot matrix suggested a complex correlation between the total base water used in fracturing simulation and casing thickness. Logistic Regression identified nine variables as significant failure contributors: TVD, operator, frac start month, MD of the most severe DL, heel TVD, hole size, BHT, total proppant mass, and cumulative DLS in the lateral and build sections. PivotTables showed that the rate of casing failure was highest during the winter season. This investigation aims to develop a thorough understanding of casing failures and their many contributing factors in order to build comprehensive predictive models for future failure prediction via data mining algorithms. These models are intended to provide a theoretical and statistical basis for cost-effective, safe, and improved drilling practices.
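As a rough illustration of the kind of Logistic Regression screening described in the abstract above, the following is a minimal sketch and not the authors' SAS workflow. It fits a logistic model by plain batch gradient descent on a tiny synthetic data set. The two feature columns (a standardized dogleg severity and a standardized proppant mass), the six sample rows, and the learning-rate and epoch settings are all hypothetical values chosen for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(w, b, x):
    """Predicted probability of failure for one feature vector x."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights w and intercept b by batch gradient descent on log-loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi  # gradient of log-loss w.r.t. the logit
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Synthetic, hypothetical rows: [standardized dogleg severity, standardized proppant mass]
X = [[-1.5, 0.2], [-1.0, -0.5], [-0.5, 0.8], [0.5, -0.3], [1.0, 0.4], [1.5, -0.7]]
y = [0, 0, 0, 1, 1, 1]  # 1 = failed casing, 0 = offset well (made-up labels)

w, b = fit_logistic(X, y)
print("weights:", w, "intercept:", b)
print("P(failure) for first row:", predict_proba(w, b, X[0]))
```

In this toy data the failure label follows the first feature, so its fitted weight comes out positive. A real analysis would also need significance testing of each coefficient, which this sketch does not attempt.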
Drilling problems such as stick-slip vibration, hole cleaning, pipe failures, loss of circulation, BHA whirl, stuck pipe incidents, excessive torque and drag, low ROP, bit wear, formation damage and borehole instability, and the drilling of highly tortuous wells have only been tackled using physics-based models. Despite the massive generation of real-time metadata, there is a tremendous gap between statistics-based models and empirical, mathematical, and physics-based models. Data mining techniques have made prominent contributions across a broad spectrum of industries. Their value is widely appreciated in a variety of applications, but their potential has not been fully tapped in the oil and gas industry. This paper presents a review compiling several years of data analytics applications in drilling operations. The review discusses the benefits and deficiencies of present practices, the challenges, and novel applications under development to overcome industry deficiencies. The study comprises a comprehensive compilation of data mining algorithms and industry applications from a predictive analytics standpoint, using supervised and unsupervised advanced analytics algorithms to identify hidden patterns and help mitigate drilling challenges. Traditional data preparation and analysis methods are not sufficiently capable of rapid information extraction and clear visualization of large, complicated data sets. To address the petroleum industry's unmet demand, a Machine Learning (ML)-assisted industry workflow for drilling optimization and real-time parameter analysis and mitigation is presented. This paper summarizes data analytics case studies, workflows, and lessons learnt that would allow field personnel, engineers, and management to quickly interpret trends, detect failure patterns in operations, diagnose problems, and execute remedial actions to monitor and safeguard operations.
Such a comprehensive workflow can minimize tool failure, save millions in replacement and maintenance costs, NPV, and lost production, minimize industry bias, and drive intelligent business decisions. This study will identify areas of improvement and opportunities to mitigate malpractices. Data exploitation via the proposed platform is based on well-established ML and data mining algorithms from the computer science and statistics literature. This approach enables safe operations and the handling of extremely large databases, thereby facilitating difficult decision-making processes.
The objective of this work is to further explore the potential application of Machine Learning algorithms in production prediction and ultimate recovery. Machine learning approaches such as Gradient Boosted Trees (GBT), AdaBoost, and Support Vector Regressor (SVR) are applied to detect the most important features contributing to cumulative production prediction within the first 12 producing months. The models are applied to a data set composed of 5 wells in the Volve field in the North Sea. The collected data were then filtered and used to structure and train the different regression algorithms and fine-tune the appropriate hyperparameters. All models were evaluated by measuring the Mean Absolute Error (MAE). The generalization and precision performance of the proposed models are established by comparing the forecasted outcome after cross-validation with field data. The optimized model can predict production response with high accuracy. The data-fitting process comprises splitting the data set into 70% training, 15% validation, and 15% testing, then constructing a regression model on the training set and validating it with the test set. Repeated application of a "cross-validation" process produces important information concerning the robustness of any regression-modeling method. Six parameters were used as model inputs: on-stream hours, average choke size, bore oil volume, bore gas volume, bore water volume, and average wellhead pressure. The outcome showed that the developed model provided better prediction than analytical models, with an MAE of 11.71% for SVR. This novel data mining application could be trained on any data set to help predict future production performance under any conditions in any given scenario.
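The 70/15/15 split and the MAE metric described in the abstract above can be sketched in a few lines. This is a minimal, dependency-free illustration under stated assumptions, not the authors' pipeline: the function names `split_70_15_15` and `mean_absolute_error`, the fixed random seed, and the placeholder rows are all hypothetical. The source does not say how its percentage MAE was normalized, so the sketch returns MAE in the units of the target.

```python
import random


def split_70_15_15(rows, seed=0):
    """Shuffle rows and split them into train/validation/test (70%/15%/15%)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility (assumption)
    n = len(shuffled)
    n_train = n * 70 // 100  # integer arithmetic avoids float rounding surprises
    n_val = n * 15 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


def mean_absolute_error(actual, predicted):
    """Average absolute difference between observed and predicted values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Placeholder rows standing in for production records (hypothetical data)
rows = list(range(100))
train, val, test = split_70_15_15(rows)
print(len(train), len(val), len(test))

# Made-up observed vs. predicted values, for illustration only
print(mean_absolute_error([10, 20, 30], [12, 18, 33]))
```

A random shuffle is the simplest choice. For time-ordered production data, a chronological split may be more appropriate, since shuffling can let later observations inform predictions of earlier ones.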
Despite numerous studies on the subject, the industry has yet to resolve casing failure issues. This study takes a more interdisciplinary, data-driven approach, integrating seventy-eight land-based wells to predict the reasons behind casing failure. It uses statistical software together with the Python scikit-learn library to apply different Data Mining and Machine Learning algorithms to twenty-four features of the twenty failed casing data sets. The descriptive analytics, presented as visual representations, included Normal Distribution Charts and a Heat Map. Principal Component Analysis (PCA) was used for dimensionality reduction. Supervised and unsupervised approaches were selected, respectively, based on the response. The algorithms used in this study included Support Vector Machine (SVM), Bootstrap, Random Forest, Naïve Bayes, XGBoost, and K-Means Clustering. Nine models were then compared against each other to determine the best performer. Features contributing to casing failure were identified based on the best algorithm performance.
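Of the algorithms listed in the abstract above, K-Means Clustering is the simplest to show end to end. The study reports using scikit-learn. The sketch below instead uses only the Python standard library to stay dependency-free, so it illustrates the technique and does not reproduce the authors' code. The two-feature points, the choice of k = 2, and the deterministic initialization from the first k points are assumptions made for this example.

```python
import math


def kmeans(points, k, max_iter=50):
    """Lloyd's algorithm: alternate assignment and centroid-update steps."""
    # Deterministic start: the first k points serve as initial centroids (assumption)
    centroids = [list(p) for p in points[:k]]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # converged: assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical, already-scaled feature pairs for six wells (made-up values)
points = [(1.0, 1.2), (5.0, 5.1), (0.8, 1.0), (1.1, 0.9), (5.2, 4.9), (4.9, 5.0)]
labels, centroids = kmeans(points, k=2)
print(labels)
print(centroids)
```

Production implementations usually pick starting centroids at random, or with k-means++, and repeat the fit several times. The fixed start used here only keeps the example reproducible.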