2021
DOI: 10.1109/access.2021.3055992

Optimal Trees Selection for Classification via Out-of-Bag Assessment and Sub-Bagging

Abstract: The effect of training data size on machine learning methods has been well investigated over the past two decades. The predictive performance of tree-based machine learning methods generally improves at a decreasing rate as the size of the training data increases. We investigate this in the optimal trees ensemble (OTE), where the method fails to learn from some of the training observations due to internal validation. Modified tree selection methods are thus proposed for OTE to cater for the loss of training observ…

citations
Cited by 30 publications
(11 citation statements)
references
References 47 publications
“…ML techniques were used to classify MTBC smear-negative pulmonary-TB patients by Mello et al [13]. The span of the study was 3 years from April 1, 1995, to December 31, 1998. A total of 551 cases were considered for the analysis recording current TB symptoms and radiological information such as cavities above the lungs.…”
Section: Related Work (mentioning)
confidence: 99%
“…In the current research work, only demographic, medical, and psychological information is included, but for future work, chest X-rays and chest scans could also be considered for further insights. Additional feature selection methods [25][26][27][28][29] and classifiers [30,31] could also be used for further investigation. The same algorithms could also be extended for the prediction of binding of G-protein-coupled receptors (GPCRs) and ligands using machine learning algorithms [32] and biomedical image classification in a big data architecture [33].…”
Section: Conclusion and Recommendation (mentioning)
confidence: 99%
“…Recently, there have been advances in developing interpretational tools for RF and other black box techniques (Sies and Van Mechelen 2020; Ribeiro et al 2016). Others have adapted ensemble methods with trees to increase interpretability (Meinshausen 2010) and searched for optimal tree ensembles (Khan et al 2020, 2021). We note that both in our simulation experiments as well as in the benchmark data set, there was at least one dataset or condition for which SUBiNN outperformed RF.…”
Section: Conclusion and Discussion (mentioning)
confidence: 82%
“…The final set of genes, in that case, will be the combination of genes selected from all the clusters. Extending performance assessment of selected genes to other recent classification methods (Khan et al, 2020a; Gul et al, 2018; Khanal et al, 2020; Khan et al, 2020b) could further validate the proposed gene selection methods.…”
Section: Discussion (mentioning)
confidence: 93%