2020
DOI: 10.1002/cem.3231

Different strategies for the use of random forest in NMR spectra

Abstract: Nuclear magnetic resonance (NMR) can provide a large amount of information about an analyzed sample; however, its spectra contain above 6000 variables, making it difficult for random forest (RF) applications. Reducing the size of the original dataset can minimize this problem. In this paper, we compared RF classification models obtained with full NMR spectral range and from the reduction of NMR variables, using principal component analysis (PCA) and the Fisher discriminant (FD). Then, the variables used in the…
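
The variable-reduction step mentioned in the abstract can be illustrated with a minimal sketch. It assumes a common two-class form of the Fisher discriminant score (squared difference of class means divided by the sum of class variances, computed per variable); the abstract does not give the formula, so this is not necessarily the authors' definition. The function names `fisher_scores`, `top_variables`, and `select_columns` are hypothetical, and the data are toy values rather than NMR spectra.

```python
from statistics import mean, pvariance


def fisher_scores(X, y):
    """Per-variable Fisher ratio for a two-class problem (assumed definition).

    X: list of samples, each a list of variable intensities.
    y: list of class labels with exactly two distinct values.
    """
    labels = sorted(set(y))
    if len(labels) != 2:
        raise ValueError("this sketch handles exactly two classes")
    class_a = [row for row, lab in zip(X, y) if lab == labels[0]]
    class_b = [row for row, lab in zip(X, y) if lab == labels[1]]

    scores = []
    for j in range(len(X[0])):
        col_a = [row[j] for row in class_a]
        col_b = [row[j] for row in class_b]
        between = (mean(col_a) - mean(col_b)) ** 2    # class separation
        within = pvariance(col_a) + pvariance(col_b)  # spread inside classes
        if within == 0:
            # Degenerate case: no within-class spread for this variable.
            scores.append(float("inf") if between > 0 else 0.0)
        else:
            scores.append(between / within)
    return scores


def top_variables(X, y, k):
    """Indices of the k highest-scoring variables, in ascending index order."""
    scores = fisher_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def select_columns(X, keep):
    """Reduced dataset containing only the selected variables."""
    return [[row[j] for j in keep] for row in X]


# Toy data: only variable 0 separates the two classes.
X = [[1.0, 5.0, 0.1], [1.2, 5.1, 0.9], [3.0, 5.0, 0.2], [3.2, 5.1, 0.8]]
y = [0, 0, 1, 1]
keep = top_variables(X, y, k=1)
X_reduced = select_columns(X, keep)
```

The reduced matrix would then be passed to an RF classifier in place of the full spectrum; that step is omitted here to keep the sketch free of third-party dependencies.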

Cited by 9 publications
(4 citation statements)
References 25 publications
“…The RF has received increasing attention for efficient variable selection among other machine learning techniques. We refer to Afanador et al, 1 Behnamian et al, 2 Li et al, 3 Speiser et al, 4 and Lovatti et al, 5 to mention a few. However, to the best of our knowledge, this efficient machine learning technique for variable selection is not used with the Liu regression method in none of the related existing studies.…”
Section: Introduction (mentioning)
confidence: 99%
“…Then, Fisher's discriminant analysis was used for variable selection to find the spectral regions responsible for maximising the region separation or leaf collection times and minimising the sample separation from the same class. 53 Finally, the selected spectrum regions were submitted to principal component analysis (PCA) to identify the bands responsible for distinguishing the locations and leaf collection times of the day. The models were built using MATLAB R2015a (MathWorks Inc., Natick, MA, USA).…”
Section: PCA Analyses (mentioning)
confidence: 99%
“…RF is an algorithm that combines Classification and regression trees (CART) and bootstrapping aggregation (bagging) algorithms. RF applications to solve classification problems of large data and a larger sample set can generate a high computational cost in addition to requiring more time [27]. The detailed descriptions of the SVM, DT, and RF algorithms are described elsewhere and therefore not shown here.…”
Section: Machine Learning Algorithms (mentioning)
confidence: 99%