2020
DOI: 10.1038/s41598-020-64870-z

Correlation-centred variable selection of a gene expression signature to predict breast cancer metastasis

Abstract: Predictions of distant cancer metastasis based on gene signatures are studied intensively to realise precise diagnosis and treatments. Gene selection, i.e. feature selection, is a cornerstone to both establish accurate predictions and understand underlying pathologies. Here, we developed a simple but robust feature selection method using a correlation-centred approach to select minimal gene sets that have both high predictive and generalisation abilities. A multiple logistic regression model was used to predict …

Cited by 5 publications (6 citation statements)
References 31 publications
“…TFs with non-zero values and a significant (p < 0.001) effect on PD-L1 expression were included in the final MLR model. As a side note, although we utilized only fully independent target genes in our estimates of TF activity, potential multicollinearity of the involved genes might still weaken the model’s generalization ability 36. However, multicollinearity analysis by calculation of the Variation Inflation Factor (VIF) values of the model’s independent variables resulted in VIFs lower than 4, indicating that the selected descriptors were not highly correlated (Supplementary file 1, Table S1).…”
Section: Results
mentioning
confidence: 99%
“…VIF is an abbreviation for the variance inflation factor test, which is used to determine the existence of a multicollinearity issue in the OLS model. If VIF exceeds a threshold value of 10 then it results in misleading T-statistic values (Hikichi et al, 2020); *** p < 0.01, ** p < 0.05, * p < 0.1 are considered significant when the T-statistics in brackets are corrected for heteroscedasticity. Table 2 provides definitions for all variables…”
Section: Figure
mentioning
confidence: 99%
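
The variance inflation factor named in the statements above can be computed by regressing each predictor on the remaining predictors and taking 1 / (1 - R²). The following is a minimal sketch of that calculation, not code from any of the cited papers: the function name `variance_inflation_factors` and the synthetic data are hypothetical, and the thresholds of 4 and 10 are simply the values quoted in the citation statements.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return one VIF per column of X (rows = samples, columns = predictors).

    Hypothetical helper for illustration only.
    """
    X = np.asarray(X, dtype=float)
    n_samples, n_features = X.shape
    vifs = np.empty(n_features)
    for j in range(n_features):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress predictor j on all other predictors plus an intercept.
        design = np.column_stack([np.ones(n_samples), others])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        residuals = y - design @ coef
        ss_res = float(residuals @ residuals)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        # VIF = 1 / (1 - R^2) = ss_tot / ss_res; infinite if the fit is exact.
        vifs[j] = np.inf if ss_res == 0.0 else ss_tot / ss_res
    return vifs


# Synthetic example (assumed data): x3 is almost a copy of x1, x2 is independent.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = rng.normal(size=200)
x3 = x1 + 0.1 * rng.normal(size=200)
vifs = variance_inflation_factors(np.column_stack([x1, x2, x3]))

for name, value in zip(["x1", "x2", "x3"], vifs):
    print(f"{name}: VIF = {value:.2f}")
```

In this synthetic setup the near-duplicate pair `x1` and `x3` should receive large VIFs while the independent `x2` stays close to 1, which is the pattern the quoted thresholds are meant to detect.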
“…VIF is an abbreviation for the variance inflation factor test, which is used to determine the existence of a multicollinearity issue in the OLS model. If VIF exceeds a threshold value of 10 then it results in misleading T-statistic values (Hikichi et al, 2020); *** p < 0.01, ** p < 0.05, * p < 0.1 are considered significant when the T-statistics in brackets are corrected for heteroscedasticity. Table 2 provides definitions for all variables IFRS mandatory adoption financial reports and establishing defined measurement and recognition requirements. Adopting IFRS increases the accuracy of financial reporting in IPO prospectuses and reduces investor ex ante uncertainty regarding the proper valuation of the IPO offering price, minimizing information asymmetry and IPO underpricing.…”
mentioning
confidence: 99%
“…To solve the multicollinearity of genes and improve the generalization ability of the model, Hikichi et al (91) applied the same dataset as which Wang et al (90) used to build 76-Gene signature to develop a simple but robust feature selection method using a correlation-centered approach. They obtained a 12-gene set with both high predictive and generalization abilities and constructed a prediction model for 5-year DM in patients with early-stage breast cancer, of which the prediction efficiency was similar to that of Rotterdam 76-gene assay.…”
Section: Other Special Gene-expression Assays
mentioning
confidence: 99%