2021
DOI: 10.1186/s40537-021-00472-4

Determining threshold value on information gain feature selection to increase speed and prediction accuracy of random forest

Abstract: Feature selection is a pre-processing technique used to remove unnecessary characteristics, and speed up the algorithm's work process. A part of the technique is carried out by calculating the information gain value of each dataset characteristic. Also, the determined threshold rate from the information gain value is used in feature selection. However, the threshold value is used freely or through a rate of 0.05. Therefore this study proposed the threshold rate determination using the information gain value’s …
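The baseline the abstract mentions, scoring each feature by information gain and keeping those above a fixed threshold of 0.05, can be sketched as follows. This is only an illustration under stated assumptions: the features are treated as categorical, the function names and the toy data are hypothetical, and it does not reproduce the threshold determination the paper proposes, which is cut off in the abstract above.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their weighted entropy within each feature value."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_features(columns, labels, threshold=0.05):
    """Return the feature names whose information gain exceeds the threshold,
    together with the gain computed for every feature."""
    gains = {name: information_gain(values, labels) for name, values in columns.items()}
    selected = [name for name, gain in gains.items() if gain > threshold]
    return selected, gains


# Hypothetical toy data: "outlook" separates the classes, "noise" does not.
columns = {
    "outlook": ["sunny", "sunny", "rain", "rain"],
    "noise": ["a", "b", "a", "b"],
}
labels = ["no", "no", "yes", "yes"]

selected, gains = select_features(columns, labels)
```

The selected columns would then be passed to the classifier (a random forest in the paper) in place of the full feature set. Continuous features would need to be discretized first, or scored with an estimator designed for them.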

Cited by 55 publications (39 citation statements)
References: 40 publications
“…This study followed previous studies ( Prasetiyowati, Maulidevi & Surendro, 2021 ; Prasetiyowati, Maulidevi & Surendro, 2020a ; Prasetiyowati, Maulidevi & Surendro, 2020b ). The researchers began this study by using the Correlation-based Feature Selection (CBF) for feature selection.…”
Section: Introduction
Citation type: mentioning
confidence: 72%
“…Random Forest is a classification algorithm based on the random selection of trees ( Gounaridis & Koukoulas, 2016 ; Prasetiyowati, Maulidevi & Surendro, 2020a ; Prasetiyowati, Maulidevi & Surendro, 2021 ), thereby making it uninformative as a tool used to build the decision tree ( Breiman, 2001 ; Prasetiyowati, Maulidevi & Surendro, 2021 ; Scornet, Biau & Vert, 2015 ). However, this process allows the selected feature to be uninformative.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…ML can select significant predictors and exclude collinear variables, whereas unsupervised ML uses all the predictors with the same weights. Weights of RSA traits affected ML models in numerous other studies [80-83]. In this study, we segmented root crowns and used RhizoVision Explorer to extract root traits for use in these models.…”
Section: Discussion
Citation type: mentioning
confidence: 99%