Detecting the early stages of failures is a long-standing concern of the petroleum industry. To tackle this problem, a novel sensor analysis methodology is proposed. Assessing the behavior of production sensors, individually or in groups, leads to a better understanding of failure modes during oil and gas production. Principal Component Analysis (PCA) and Logistic Regression are therefore used as multivariate statistical models to study the impact of different anomalies on production sensors. A deep statistical analysis of these sensors can strengthen the assumptions that support the modeling of early fault detection systems. Based on a reliable public data set containing data from real wells, applying PCA combined with Logistic Regression resulted in better visualization and understanding of several failures that occurred during petroleum production: abrupt increase in BSW (basic sediment and water), spurious closure of the DHSV (downhole safety valve), severe slugging, flow instability, productivity loss, quick restriction in the PCK (production choke), scaling in the PCK, and hydrate formation in production lines. The two statistical approaches were used as a combined method to provide useful information about the failure modes in the data set. The data set also presents two classes that are important for anomaly detection in oil wells, "normal" and "abnormal", which make it possible to detect when production is outside its normal condition. Analyzing production sensors together with failure data can therefore help to formulate better detection algorithms. Using PCA and Logistic Regression, it was possible to identify which set of variables is best suited to detecting a specific type of problem. Applying these techniques supports the modeling of early detection systems in oil and gas production.
In addition, the assumptions led to conclusions about how to group sensors and abnormalities and about how long a well remains in a steady normal condition. Other conclusions showed the significance of transient information for fault detection modeling and the need for well-by-well analyses. Hence, using PCA to treat and transform the data makes an important contribution to early fault detection modeling, since it gave insight into how sensors and abnormal events can be related. Consequently, the present paper makes a novel contribution: it raises important assumptions that help to build solid knowledge about the behavior of anomalies and help researchers implement a better modeling strategy.
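The combination of PCA and Logistic Regression described above can be illustrated with a minimal sketch. It is not the authors' implementation: the four "sensor" columns are synthetic, the size of the shift applied to the abnormal class is an arbitrary assumption, and the function names (`pca_fit`, `pca_transform`, `logistic_fit`, `logistic_predict`) are hypothetical helpers written only for this example. The sketch standardizes the readings, projects them onto two principal components, and fits a logistic model on the projected scores to separate "normal" from "abnormal" samples.

```python
import numpy as np


def pca_fit(X, n_components):
    """Return the column means, principal axes, and explained-variance ratios."""
    mean = X.mean(axis=0)
    centered = X - mean
    # SVD of the centered data: the rows of vt are the principal axes,
    # ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance = s ** 2 / (len(X) - 1)
    ratio = variance / variance.sum()
    return mean, vt[:n_components], ratio[:n_components]


def pca_transform(X, mean, components):
    """Project samples onto the retained principal axes."""
    return (X - mean) @ components.T


def logistic_fit(Z, y, lr=0.1, epochs=2000):
    """Fit a logistic model by plain batch gradient descent."""
    w = np.zeros(Z.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))
        error = p - y
        w -= lr * (Z.T @ error) / len(y)
        b -= lr * error.mean()
    return w, b


def logistic_predict(Z, w, b):
    """Predict class 1 ("abnormal") where the linear score is positive."""
    return (Z @ w + b > 0).astype(int)


# Synthetic stand-in for sensor readings (assumption: 4 sensors, 2 classes).
rng = np.random.default_rng(0)
n = 200
normal = rng.normal(0.0, 1.0, size=(n, 4))
# Assumed anomaly signature: three sensors shift, the fourth is unaffected.
abnormal = rng.normal(0.0, 1.0, size=(n, 4)) + np.array([4.0, -4.0, 3.0, 0.0])
X = np.vstack([normal, abnormal])
y = np.concatenate([np.zeros(n), np.ones(n)])  # 0 = normal, 1 = abnormal

# Standardize each sensor so that no single unit dominates the components.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

mean, components, ratio = pca_fit(X_std, n_components=2)
Z = pca_transform(X_std, mean, components)

w, b = logistic_fit(Z, y)
accuracy = (logistic_predict(Z, w, b) == y).mean()

print("explained variance ratio:", ratio)
print("loadings of the first component:", components[0])
print("training accuracy:", accuracy)
```

In this toy setting the loadings of the first component indicate which sensors move together when the abnormal class occurs, which is the kind of sensor-to-anomaly relationship the abstract refers to. On real well data the number of retained components and the choice of variables would need to be examined per failure type.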
In the petroleum industry, sensor data and information are valuable. They can help to detect, predict, and understand processes during oil production. Offshore wells require more attention, since workovers, maintenance, and interventions are more costly than in onshore wells. To couple data-driven methods with well-monitoring applications, two unsupervised classification methods, one statistical and one machine learning-based, are proposed to detect anomalies in well data. The novelty lies in applying a Control Chart with a window of 3 standard deviations to the Permanent Downhole Gauge pressure sensor (P-PDG), and a Fuzzy C-means algorithm to classify data from pressure and temperature sensors in an offshore field. The main goal in structuring a classified data set is to use it to train machine learning models that monitor and manage petroleum production. Modeling applications for early fault detection systems in offshore production, based on real-time data from production sensors, require classified data sets. Labeling two target classes, "normal" and "fault", is therefore a key step in training the machine learning models. This paper applies two methodologies to classify a real-time data set and create a training data set divided into "normal" and "fault" classes. It is thus possible to visualize the abnormal events pointed out by the methodologies and compare how sensitive each method is. In addition, a random forest application is proposed to test the performance of the classified data sets from both methods. The results show that the control chart method presents higher sensitivity than fuzzy c-means; however, the differences between them are insignificant. The random forest performance displayed sensitivity and specificity values of 99.91% and 100% for the data set classified by the control chart method and 94.01% and 99.98% for the data set classified by the fuzzy c-means algorithm.
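The control chart labeling step and the sensitivity and specificity metrics mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: the pressure values are synthetic and in arbitrary units, the limits are estimated from a reference window that is assumed to be fault-free, and the function names (`control_limits`, `label_readings`, `sensitivity_specificity`) are hypothetical. The metrics here compare the control chart labels with the known synthetic labels, whereas the abstract reports them for a random forest trained on the labeled data.

```python
import random
import statistics


def control_limits(reference, k=3.0):
    """Center line and k-sigma limits estimated from a reference window."""
    center = statistics.fmean(reference)
    sigma = statistics.stdev(reference)
    return center - k * sigma, center, center + k * sigma


def label_readings(readings, lower, upper):
    """Label each reading "normal" inside the limits and "fault" outside."""
    return ["normal" if lower <= x <= upper else "fault" for x in readings]


def sensitivity_specificity(true_labels, predicted_labels, positive="fault"):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = fn = tn = fp = 0
    for truth_label, predicted_label in zip(true_labels, predicted_labels):
        if truth_label == positive:
            if predicted_label == positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_label == positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Synthetic stand-in for P-PDG pressure (assumption: arbitrary units).
random.seed(42)
reference = [random.gauss(200.0, 1.5) for _ in range(500)]    # assumed fault-free
normal_part = [random.gauss(200.0, 1.5) for _ in range(300)]
fault_part = [random.gauss(215.0, 1.5) for _ in range(50)]    # assumed pressure shift
readings = normal_part + fault_part
truth = ["normal"] * 300 + ["fault"] * 50

lower, center, upper = control_limits(reference, k=3.0)
predicted = label_readings(readings, lower, upper)
sens, spec = sensitivity_specificity(truth, predicted)

print(f"limits: [{lower:.2f}, {upper:.2f}] around {center:.2f}")
print(f"sensitivity={sens:.4f} specificity={spec:.4f}")
```

With limits at 3 standard deviations, a small share of genuinely normal readings is expected to fall outside the band, so specificity slightly below 100% is normal for this kind of chart. Widening or narrowing `k` trades false alarms against missed faults.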