2023
DOI: 10.3390/en16104159

Prediction of TOC Content in Organic-Rich Shale Using Machine Learning Algorithms: Comparative Study of Random Forest, Support Vector Machine, and XGBoost

Abstract: The total organic carbon (TOC) content of organic-rich shale is a key parameter in screening for potential source rocks and sweet spots of shale oil/gas. Traditional methods of determining the TOC content, such as the geochemical experiments and the empirical mathematical regression method, are either high cost and low-efficiency, or universally non-applicable and low-accuracy. In this study, we propose three machine learning models of random forest (RF), support vector regression (SVR), and XGBoost to predict…

citations
Cited by 11 publications
(2 citation statements)
references
References 60 publications
“…Classification and regression trees can be used in multivariate analysis to enable the study of the relationship between the dependent variable and independent variables measured on the weak scale, i.e., nominal or ordinal, and the strong scale, i.e., interval and quotient. They are a visual representation of the model [40]:…”
Section: Decision Trees; citation type: mentioning
confidence: 99%
“…In recent years, with the rapid development of artificial intelligence technology, many scholars have used machine-learning algorithms and trained many data to try to find out the characteristic signals of POC in visible and near-infrared spectra to predict the concentrations of different forms of carbon in water bodies [17][18][19]. Machine-learning models have powerful feature learning capabilities, can automatically learn complex spatial and spectral features from satellite remote sensing data, and effectively capture nonlinear relationships through multi-level data transformation and feature extraction, thus improving the performance of classification and identification, which has become a hot spot in the research of remote sensing inversion of water quality parameters, such as partial least squares regression (PLSR) [20], artificial neural networks (ANN) [21], support vector machines (SVM) [22], and convolutional neural network (CNN) [23], which have good performance in predicting the concentration of POC in water.…”
Section: Introduction; citation type: mentioning
confidence: 99%