The emergence of "big data" has encouraged the use of data from various origins to enhance the decision-making process. Unfortunately, multiphase flow studies are often performed in "silos": each study performs its own experiments and proposes model improvements based on those experiments alone. As such, it is easy to lose sight of the big picture of where we are in terms of our understanding and modeling capability. This disconnected approach has also produced an ever-growing, potentially unmanageable list of closure relationships, which can be counter-productive for model development. In this paper, we present exploratory data analyses to comprehensively evaluate the performance of a steady-state multiphase flow point model in predicting high-pressure, near-horizontal data from independent experiments. This effort provides wide-ranging hindsight that reflects the current state of the art of multiphase flow modeling and pinpoints areas where improvements are needed.
First, relevant multiphase flow datasets from the literature are collected. In this paper, we limit the scope to near-horizontal, high-pressure data (gas density of 5 kg/m³ or higher). Then, we run a state-of-the-art model and compare its predictions against these datasets. Multidimensional discrepancy plots are presented to map the model's performance for pressure drop and holdup predictions across the selected scaling variables. Violin plots are used to identify and analyze the outliers with respect to modeling errors. Confusion matrices are used to quantitatively analyze the model performance in predicting flow patterns, removing the restriction of traditional flow pattern map analysis, which is limited to qualitative assessment at constant pipe and fluid properties. Finally, the accuracies of key closure relationships are also evaluated.
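As a minimal sketch of the confusion-matrix step, the snippet below tabulates observed against predicted flow patterns and computes an overall hit rate. The label set, the function names, and the six sample points are illustrative assumptions, not the paper's data or implementation.

```python
from collections import Counter

# Hypothetical flow-pattern labels; the paper's actual classes may differ.
PATTERNS = ["stratified", "annular", "intermittent"]


def confusion_matrix(observed, predicted, labels):
    """Return a matrix with rows = observed pattern, columns = predicted pattern."""
    counts = Counter(zip(observed, predicted))
    return [[counts[(obs, pred)] for pred in labels] for obs in labels]


def accuracy(matrix):
    """Fraction of points on the diagonal (correctly predicted patterns)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else float("nan")


# Made-up observations and predictions, for illustration only.
observed = ["stratified", "stratified", "annular",
            "annular", "intermittent", "intermittent"]
predicted = ["stratified", "annular", "stratified",
             "annular", "intermittent", "stratified"]

matrix = confusion_matrix(observed, predicted, PATTERNS)
for label, row in zip(PATTERNS, matrix):
    print(f"{label:>12}: {row}")
print("accuracy:", accuracy(matrix))
```

Because every data point contributes one count regardless of its pipe diameter or fluid properties, a table like this can pool independent datasets in a way that a single flow pattern map cannot.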
The multidimensional discrepancy plots highlight the conditions where the model performs poorly: low-liquid-loading upward flow, downward flow, and high gas flow rates. The violin plots enable quick identification of outliers, which can represent both model and measurement deficiencies. The confusion matrix indicates that the transition between stratified and annular flow is very poorly predicted. The misclassification between stratified and intermittent flow is a distant second in frequency of occurrence; however, it contributes more significantly to the prediction errors of the bulk parameters. Except for the slug translational velocity, most closure parameters are still poorly predicted. Entrainment fraction deserves special attention given its expected importance to the accuracy of the stratified flow model. The closure relationships for slug characteristics are unable to predict pseudo-slug flow data accurately.
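The kind of condition-wise error summary that underlies a discrepancy map can be sketched as follows. The grouping variable (flow direction), the signed relative-error definition, and all numeric values are assumptions chosen for illustration; they are not results from the paper.

```python
from collections import defaultdict
from statistics import median


def relative_error(predicted, measured):
    """Signed relative error; positive means the model over-predicts."""
    return (predicted - measured) / measured


def median_error_by_group(records):
    """Group (group, predicted, measured) records and take the median error per group."""
    grouped = defaultdict(list)
    for group, predicted, measured in records:
        grouped[group].append(relative_error(predicted, measured))
    return {group: median(errors) for group, errors in grouped.items()}


# Made-up pressure-gradient values, for illustration only.
records = [
    ("upward", 120.0, 100.0),
    ("upward", 90.0, 100.0),
    ("upward", 150.0, 100.0),
    ("downward", 50.0, 100.0),
    ("downward", 100.0, 100.0),
]

summary = median_error_by_group(records)
for group, err in summary.items():
    print(f"{group:>9}: median relative error = {err:+.2f}")
```

The median is used here rather than the mean so that a few outliers, which may stem from measurement problems, do not dominate the summary for a given condition.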
This paper presents several Exploratory Data Analysis (EDA) techniques that enable comprehensive analyses of independent datasets from various origins. The analyses provide actionable and more general insights that would otherwise be obscured if individual datasets were analyzed in silos, such as the operating conditions where higher uncertainty margins need to be applied and where further modeling improvements are desirable.