A significant share of the capital and operational expenditures of oil and gas companies goes to well construction. Unexpected situations inevitably occur during drilling, regardless of the level of well construction technology and the information available. These situations lead to additional spending and non-productive time. We present a machine learning (ML) algorithm for predicting the most common drilling accidents in the industry: stuck pipe, mud loss, and fluid show.
The model for forecasting drilling accidents is based on the Bag-of-features approach, which labels each segment of surface telemetry data with a symbol, called a codeword, from a predefined codebook. A histogram of the codewords over a one-hour telemetry interval then serves as the input to the machine learning algorithm. To train the ML model, we use data from more than 100 drilling accidents at different oil and gas wells, in which we identified more than 3000 drilling accident predecessors and about 5000 normal drilling segments.
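The codeword-histogram step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the segment length, the per-segment descriptors (mean, standard deviation, slope), the nearest-neighbor assignment rule, and all function names are assumptions made for illustration, and the codebook is taken as given (for example, cluster centers obtained beforehand).

```python
import numpy as np


def segment_features(signal, seg_len):
    """Split a 1-D telemetry signal into non-overlapping segments and
    describe each one by (mean, std, least-squares slope).
    The choice of descriptors is an assumption for this sketch."""
    signal = np.asarray(signal, dtype=float)
    n_segs = len(signal) // seg_len
    segs = signal[: n_segs * seg_len].reshape(n_segs, seg_len)
    # Centered time axis, so the slope reduces to a single dot product.
    t = np.arange(seg_len) - (seg_len - 1) / 2.0
    slope = segs @ t / (t @ t)
    return np.column_stack([segs.mean(axis=1), segs.std(axis=1), slope])


def assign_codewords(features, codebook):
    """Label each segment with the index of its nearest codeword
    (Euclidean distance; the assignment rule is an assumption)."""
    dists = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)


def codeword_histogram(codes, n_codewords):
    """Normalized histogram of codeword labels over one interval."""
    counts = np.bincount(codes, minlength=n_codewords).astype(float)
    total = counts.sum()
    return counts / total if total > 0 else counts


# Hypothetical example: four 60-sample segments and a 3-word codebook.
signal = np.concatenate([np.zeros(120), np.arange(120.0)])
codebook = np.array([[0.0, 0.0, 0.0], [30.0, 17.0, 1.0], [90.0, 17.0, 1.0]])
codes = assign_codewords(segment_features(signal, seg_len=60), codebook)
hist = codeword_histogram(codes, n_codewords=len(codebook))
```

The resulting vector `hist` has one entry per codeword and sums to one, so intervals with different numbers of segments remain comparable when passed to a classifier.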
Model performance was estimated using two main metrics. The coverage metric indicates the share of actual events that were correctly forecast. The second metric is the number of false alarms per day for a specified probability threshold. Using different schemes of metric calculation, one can evaluate the model's ability to both forecast and detect accidents. Validation tests show that our algorithm performs well on historical and real-time data. At each moment, the model analyzes the real-time data for the last hour and provides the probability that the segments contain signs of drilling accident predecessors of a particular type. The prediction quality does not vary from field to field, so the ML model can be used in different fields without additional training.
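One possible way to compute the two metrics is sketched below. The abstract does not give the exact calculation scheme, so the matching rule is an assumption: an event counts as covered if at least one alarm is raised within a fixed horizon before it, and any alarm not matched to an event counts as false. The function and parameter names are hypothetical.

```python
def alarms_from_probabilities(times, probs, threshold):
    """Return the timestamps at which the predicted probability
    reaches the chosen threshold."""
    return [t for t, p in zip(times, probs) if p >= threshold]


def coverage_and_false_alarms(alarm_times, event_times, horizon_h, total_hours):
    """Compute (coverage, false alarms per day).

    All times are in hours from the start of the record.
    An event is covered if some alarm falls in [event - horizon_h, event];
    this matching rule is an assumption of the sketch.
    """
    covered = 0
    matched = set()
    for event in event_times:
        hits = [i for i, a in enumerate(alarm_times)
                if event - horizon_h <= a <= event]
        if hits:
            covered += 1
            matched.update(hits)
    coverage = covered / len(event_times) if event_times else float("nan")
    false_alarms = len(alarm_times) - len(matched)
    return coverage, false_alarms / (total_hours / 24.0)


# Hypothetical example: three alarms and two accidents over 48 hours.
coverage, fa_per_day = coverage_and_false_alarms(
    alarm_times=[5, 20, 40], event_times=[6, 30],
    horizon_h=2, total_hours=48,
)
```

Raising the probability threshold typically reduces the false alarm rate at the cost of coverage, so the two numbers are best read together for each threshold.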
The model is currently being tested on real oilfields in Russia. To operate the model, we developed software that integrates with a Wellsite Information Transfer Standard Markup Language (WITSML) data server in the client's existing IT infrastructure. All calculations are performed in the cloud and do not require significant additional computing power on the client side.