Crude oils are among the world’s most complex organic mixtures, containing a large number of unique components, and many analytical techniques lack the resolving power to characterize them. Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) offers high mass accuracy, making a detailed analysis of crude oils possible. Infrared (IR) spectroscopic methods, such as Fourier transform IR (FT-IR) and near-IR spectroscopy, can also be used for crude oil characterization. The three methods measure different properties of the samples, and different data sources can often be combined to improve the prediction accuracy of models. In this study, partial least squares regression (PLSR) models for each of the three methods (single-block PLSR) were compared to multiblock PLSR and sequential and orthogonalized PLSR (SO-PLSR), with the aim of predicting the density of crude oils. Because spectroscopic data often contain irrelevant variation, variable importance in projection (VIP) was used to identify the important variables for each method. The selected variables were interpreted in terms of their underlying chemistry and checked for consistency between the single-block and multiblock methods. Combining the different blocks of data improved the predictive ability of the models both before and after variable selection, and SO-PLSR using a reduced data set gave the best-performing prediction model.
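The abstract names variable importance in projection but gives neither a formula nor the software used. As a rough illustration only, the sketch below uses one common VIP definition for a single-response PLS model: VIP_j = sqrt(p * sum_a(SS_a * w_ja^2 / ||w_a||^2) / sum_a(SS_a)), with SS_a = q_a^2 * t_a't_a. The function name, argument layout, and toy numbers are hypothetical and are not taken from the study.

```python
import math


def vip_scores(weights, scores, y_loadings):
    """Variable importance in projection for a single-response PLS model.

    weights    : p x A nested list, weights[j][a] = weight of variable j on component a
    scores     : n x A nested list, scores[i][a] = score of sample i on component a
    y_loadings : length-A list, y loading q_a of each component
    (Hypothetical argument layout; adapt to whatever your PLS software returns.)
    """
    p = len(weights)
    n_comp = len(y_loadings)

    # Sum of squares of y explained by each component: SS_a = q_a^2 * t_a' t_a
    ss = []
    for a in range(n_comp):
        t_sq = sum(row[a] ** 2 for row in scores)
        ss.append(y_loadings[a] ** 2 * t_sq)
    total_ss = sum(ss)

    # Squared norm of each weight vector, so weights need not be pre-normalized
    w_norm_sq = [sum(weights[j][a] ** 2 for j in range(p)) for a in range(n_comp)]

    vip = []
    for j in range(p):
        acc = sum(ss[a] * weights[j][a] ** 2 / w_norm_sq[a] for a in range(n_comp))
        vip.append(math.sqrt(p * acc / total_ss))
    return vip


# Toy example (made-up numbers): 3 variables, 4 samples, 2 components
toy_weights = [[0.8, 0.1], [0.6, 0.2], [0.0, 0.9]]
toy_scores = [[1.0, 0.2], [-0.5, 0.4], [0.3, -0.6], [-0.8, 0.0]]
toy_q = [0.9, 0.3]
toy_vip = vip_scores(toy_weights, toy_scores, toy_q)
```

With this definition the squared VIP values sum to the number of variables, which is why a cut-off of VIP > 1 is a frequently used rule of thumb; the threshold applied in the study is not stated in the abstract.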
Blockage of pipelines caused by the agglomeration of gas hydrates is a major flow assurance issue in the oil and gas industry. Some crude oils form gas hydrates that remain as transportable particles in a slurry. It is commonly believed that naturally occurring components in those crude oils alter the surface properties of the gas hydrate particles as they form. The exact structure of the crude oil components responsible for this surface modification remains unknown. In this study, successive accumulation and spiking of hydrate-active crude oil fractions was performed to increase the concentration of hydrate-related compounds. Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) was then used to analyse the extracted oil samples from each spiking generation. Machine learning-based variable selection was applied to the FT-ICR MS spectra to identify the components related to hydrate formation. Among six different methods, partial least squares discriminant analysis (PLS-DA) was selected as the best-performing model, and the 23 most important variables were determined. The FT-ICR MS spectra for each spiking level were compared with those of samples extracted before the successive accumulation to identify changes in composition. Principal component analysis (PCA) revealed differences between the oils and spiking levels, indicating an accumulation of hydrate-active components. Molecular formulas, double bond equivalents (DBE) and hydrogen-to-carbon (H/C) ratios were determined and evaluated for each of the selected variables. Some variables were tentatively identified as asphaltenes and naphthenic acids, which could be related to the positive wetting index (WI) of the oils.
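The DBE and H/C values mentioned above follow directly from an assigned molecular formula. A minimal sketch, assuming neutral molecules containing only C, H, N, O and S, uses the standard relation DBE = C - H/2 + N/2 + 1 (O and S do not contribute). The helper names and the example formulas are illustrative and are not taken from the study.

```python
import re


def parse_formula(formula):
    """Count atoms in a simple formula string such as 'C20H30O2' (no brackets or charges)."""
    counts = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def dbe(counts):
    """Double bond equivalents of a neutral CHNOS molecule: C - H/2 + N/2 + 1."""
    c = counts.get("C", 0)
    h = counts.get("H", 0)
    n = counts.get("N", 0)
    return c - h / 2 + n / 2 + 1


def hc_ratio(counts):
    """Atomic hydrogen-to-carbon ratio."""
    return counts.get("H", 0) / counts["C"]


# Illustrative formulas (not from the study)
acid = parse_formula("C20H30O2")      # a C20 carboxylic acid
carbazole = parse_formula("C12H9N")   # a nitrogen-containing aromatic

print(dbe(acid), hc_ratio(acid))            # 6.0 1.5
print(dbe(carbazole), hc_ratio(carbazole))  # 9.0 0.75
```

A high DBE combined with a low H/C ratio points towards condensed aromatic structures, while a lower DBE with an H/C ratio closer to 2 suggests more saturated species. Ions observed by FT-ICR MS would need to be converted to their neutral formulas before applying the relation.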