The Marcellus Shale has more than a decade of development history, yet many questions remain unanswered. What is the best inter-well spacing? What are the optimum stage length, proppant loading, and cluster spacing? What are the ideal combinations of these completion parameters? And how can we maximize the rate of return on our investment? This study proposes innovative tools that help researchers answer these questions. We build this set of tools by applying the pattern recognition abilities of machine learning algorithms to public data from the southwestern Pennsylvania region of the Marcellus Shale.
Using artificial intelligence and data mining techniques, we studied a database of public data from more than 2,000 wells producing in the study area. The database contains completion, drilling, and production history information from various operators active in Allegheny, Greene, Fayette, Washington, and Westmoreland counties in southwestern Pennsylvania. Because the core data come from public sources, extensive preprocessing and data cleansing were required to prepare the database. Several machine learning techniques, namely Linear Regression (LR), Support Vector Machines (SVMs), Artificial Neural Networks (ANNs), and Gaussian Processes (GP), were applied to capture the non-linear patterns in the data. The objective was to develop predictive models trained on the current database and validated against data from numerous wells in the area. Once validated, the models can be used in reservoir management decision-making workflows to answer questions such as: What are the best drilling scenarios? What is the optimum hydraulic fracturing design? What are the expected initial production rate and estimated ultimate recovery (EUR)? The workflow is based purely on field data and is free of human cognitive bias. The models can be updated as more data become available.
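The train-and-validate step described above can be illustrated with a minimal sketch. It is not the study's implementation: the data are synthetic, the single input feature (proppant loading in lb/ft), the EUR proxy, the 20% hold-out fraction, and all function names are assumptions made for illustration, and only a one-variable linear regression baseline is shown. An SVM, ANN, or GP model would take the place of `fit_line` in the same loop.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, preds):
    """Coefficient of determination of predictions against observations."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def make_synthetic_wells(n, seed=0):
    """Hypothetical wells: (proppant loading in lb/ft, made-up EUR proxy)."""
    rng = random.Random(seed)
    wells = []
    for _ in range(n):
        proppant = rng.uniform(1000.0, 3000.0)
        # Assumed linear trend plus Gaussian noise; not field data.
        eur = 2.0 + 0.003 * proppant + rng.gauss(0.0, 0.5)
        wells.append((proppant, eur))
    return wells


def train_and_validate(wells, holdout_fraction=0.2, seed=0):
    """Fit on the training wells, then score on wells the model never saw."""
    shuffled = wells[:]
    random.Random(seed).shuffle(shuffled)
    n_holdout = int(len(shuffled) * holdout_fraction)
    holdout, train = shuffled[:n_holdout], shuffled[n_holdout:]
    slope, intercept = fit_line([w[0] for w in train], [w[1] for w in train])
    preds = [slope * w[0] + intercept for w in holdout]
    return slope, intercept, r_squared([w[1] for w in holdout], preds)


if __name__ == "__main__":
    slope, intercept, r2 = train_and_validate(make_synthetic_wells(200))
    print(f"slope={slope:.4f} intercept={intercept:.2f} holdout R^2={r2:.3f}")
```

Scoring on held-out wells rather than on the training wells is what makes the validation meaningful: a model that only memorizes the training data would score well on those wells and poorly on the unseen ones.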