2023
DOI: 10.1038/s41598-023-33327-4

Prognostic model development for classification of colorectal adenocarcinoma by using machine learning model based on feature selection technique boruta

Abstract: Colorectal cancer (CRC) is the third most prevalent cancer type and accounts for nearly one million deaths worldwide. The CRC mRNA gene expression datasets from TCGA and GEO (GSE144259, GSE50760, and GSE87096) were analyzed to find the significant differentially expressed genes (DEGs). These significant genes were further processed for feature selection through boruta and the confirmed features of importance (genes) were subsequently used for ML-based prognostic classification model development. These genes we…

Cited by 15 publications
(5 citation statements)
References 44 publications
“…[ 37,38 ] The genes included in their models were as follows: PTGER2, FGF2, IGFBP3, ANGPTL4, DKK1, WNT16, SPP1, ZNF532, COLEC12, DPP7/2, YWHAB, MCM4, FBXO46, GLP2R, VSTM2A, and so on. [ 37,39,40 ] In our model, we included directly nine key EIOGs, and finally got a mode with the highest AUC reached 0.788, which indicated that EIOS alone play an important role in the CRC progression. We also proposed that a better prognostic model could be established based on a comprehensive understanding all these genes.…”
Section: Discussion (mentioning)
confidence: 99%
“…Data was then transformed with the classical variance stabilizing transformation method. Finally, we retained protein coding genes and ncRNAs with a fold change of at least 2 and an adjusted P -value <0.001 or 0.05 for the HEFS standard mode or light mode, respectively ( 33–35 ).…”
Section: Methods (mentioning)
confidence: 99%
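
The filtering rule quoted in the statement above (a fold change of at least 2, with an adjusted P-value below 0.001 in standard mode or 0.05 in light mode) can be illustrated with a minimal Python sketch. This is not the citing authors' code. The record layout, the field names `log2fc` and `padj`, the function name `filter_genes`, and the choice to apply the fold-change cutoff symmetrically on the log2 scale are all assumptions made for illustration. Only the numeric thresholds come from the quoted text.

```python
import math

# Adjusted P-value cutoffs quoted in the statement above.
# The mode names serve only as dictionary keys in this sketch.
PADJ_CUTOFF = {"standard": 0.001, "light": 0.05}


def filter_genes(rows, mode="standard", min_fold_change=2.0):
    """Keep rows whose |log2 fold change| and adjusted P-value pass the cutoffs.

    Assumption: fold change is stored on the log2 scale and the cutoff
    applies to both up- and down-regulated genes.
    """
    min_abs_log2fc = math.log2(min_fold_change)  # fold change 2 -> 1.0
    cutoff = PADJ_CUTOFF[mode]
    return [
        r for r in rows
        if abs(r["log2fc"]) >= min_abs_log2fc and r["padj"] < cutoff
    ]


# Hypothetical toy records, not data from any cited study.
toy = [
    {"gene": "g1", "log2fc": 2.3, "padj": 0.0001},
    {"gene": "g2", "log2fc": -1.2, "padj": 0.01},
    {"gene": "g3", "log2fc": 0.4, "padj": 0.00001},
]
standard = [r["gene"] for r in filter_genes(toy, "standard")]
light = [r["gene"] for r in filter_genes(toy, "light")]
```

In this toy input, the looser light-mode cutoff admits one additional gene that the standard-mode cutoff rejects, while the gene with a small fold change is excluded in both modes regardless of its P-value.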
“…The methods include none (all features in the space included), recursive feature elimination (RFE) until 5 features, and the boruta feature selection method [ 18 ]. The boruta method was evaluated due to the effectiveness of the approach in previous studies in the medical domain [ 19–23 ]. The boruta method utilised RF and XGBoost models respectively when they were being trained.…”
Section: Methods (mentioning)
confidence: 99%