Accurate prediction of geological formation tops is crucial for optimizing hydrocarbon exploration and production. This study presents a comparative analysis of machine learning approaches for formation top prediction on the Norwegian Continental Shelf (NCS). Four models are benchmarked: Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Random Forest, and a Multi-Layer Perceptron (MLP) neural network. The models are evaluated on two distinct datasets: a dedicated test dataset and an independent blind dataset used for validation. Performance is measured by the accuracy with which each model classifies multiple formation top types. The Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm serves as a baseline, placing the supervised models in context against a conventional clustering approach. Two model-agnostic feature importance techniques, Permutation Feature Importance (PFI) and SHapley Additive exPlanations (SHAP), are used to identify and rank the most influential input variables. The MLP is the top-performing model on the blind validation dataset, with an accuracy of 0.99, surpassing the other machine learning models and the DBSCAN baseline. On the initial test dataset, however, the SVM performs best, also with an accuracy of 0.99.
The PFI and SHAP analyses consistently identify depth (DEPT), revolutions per minute (RPM), and hook load (HKLD) as the three most influential parameters across the different algorithms. These findings underscore the potential of machine learning methods, particularly neural network models, to improve the accuracy of formation top prediction in the geologically complex NCS. However, further testing on larger datasets is needed to validate the generalizability of the high performance observed. Overall, this study provides a comparative evaluation of machine learning techniques and offers insights to guide the selection, development, and deployment of accurate and reliable predictive models for hydrocarbon exploration and reservoir characterization on the NCS.
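As an illustration of the permutation feature importance idea referred to above, the following is a minimal sketch and not the study's implementation. It assumes a toy KNN classifier written from scratch, synthetic data, and two hypothetical features: a depth-like column that determines the class label and an uninformative noise column. All function names, feature choices, and parameters here are assumptions made for the example. The importance of a feature is estimated as the average drop in accuracy after its column is shuffled.

```python
import random


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k nearest training rows (squared Euclidean distance)."""
    nearest = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )[:k]
    votes = {}
    for _, label in nearest:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def accuracy(train_X, train_y, X, y):
    """Fraction of rows in X whose predicted label matches y."""
    hits = sum(knn_predict(train_X, train_y, x) == t for x, t in zip(X, y))
    return hits / len(y)


def permutation_importance(train_X, train_y, X, y, n_repeats=5, seed=0):
    """Return baseline accuracy and the mean accuracy drop per shuffled feature."""
    rng = random.Random(seed)
    baseline = accuracy(train_X, train_y, X, y)
    drops = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(n_repeats):
            # Shuffle only column j, breaking its link to the labels.
            column = [row[j] for row in X]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            total += baseline - accuracy(train_X, train_y, shuffled, y)
        drops.append(total / n_repeats)
    return baseline, drops


def make_data(n, rng):
    """Synthetic rows: [depth_like, noise]; the label depends on depth_like only."""
    X, y = [], []
    for _ in range(n):
        depth_like = rng.uniform(0.0, 3.0)  # hypothetical informative feature
        noise = rng.uniform(0.0, 1.0)       # hypothetical uninformative feature
        X.append([depth_like, noise])
        y.append(min(int(depth_like), 2))   # three hypothetical formation classes
    return X, y


if __name__ == "__main__":
    rng = random.Random(42)
    train_X, train_y = make_data(200, rng)
    test_X, test_y = make_data(100, rng)
    baseline, drops = permutation_importance(train_X, train_y, test_X, test_y)
    print("baseline accuracy:", round(baseline, 3))
    for name, drop in zip(["depth_like", "noise"], drops):
        print(name, "mean accuracy drop:", round(drop, 3))
```

In this toy setting, shuffling the depth-like column should reduce accuracy far more than shuffling the noise column, which is the kind of ranking a PFI analysis reports. Because the procedure only needs predictions, the same loop could in principle wrap any of the classifiers compared in the study.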