Background
In recent years, breast cancer (BC) mortality has decreased owing to early detection and new therapies. Nevertheless, its global prevalence remains high and the underlying molecular mechanisms remain largely unknown. Identifying prognosis-related genes as novel biomarkers for diagnosis and individualized treatment has become an urgent clinical need.
Methods
Gene expression profiles and clinical information of breast cancer patients were downloaded from The Cancer Genome Atlas (TCGA) database, and the patients were randomly divided into a training cohort (n = 514) and an internal validation cohort (n = 562) using a random number table. Differentially expressed genes (DEGs) were functionally annotated by Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses. In the training cohort, a gene signature was constructed with the least absolute shrinkage and selection operator (LASSO) method from DEGs screened using R packages. The signature was then tested in the internal validation cohort and in the entire cohort. In addition, the functions of the five signature genes were explored by MTT, colony formation, scratch and transwell assays, and Western blot analysis was used to explore the underlying mechanisms.
Results
In the training cohort, a total of 2805 protein-coding DEGs were identified by comparing breast cancer tissues (n = 514) with normal tissues (n = 113). A risk score formula comprising five novel prognosis-associated biomarkers (EDN2, CLEC3B, SV2C, WT1 and MUC2) was then constructed by LASSO. The prognostic value of the risk model was further confirmed in the internal validation cohort and in the entire cohort. In vitro assays performed to explore the biological functions of the selected genes indicated that these novel biomarkers could markedly influence breast cancer progression.
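As an illustration of how a LASSO-derived risk score of this kind is typically applied, the following minimal sketch computes a score as a coefficient-weighted sum of the five genes' expression values and splits patients into high- and low-risk groups. The coefficient values, the median cutoff and the function names are assumptions made for illustration; they are not the values or procedures reported in this study.

```python
from statistics import median

# Hypothetical placeholder coefficients, for illustration only.
# They are NOT the coefficients fitted in the study.
COEFFICIENTS = {
    "EDN2": 0.10,
    "CLEC3B": -0.20,
    "SV2C": 0.15,
    "WT1": 0.05,
    "MUC2": 0.08,
}


def risk_score(expression):
    """Return the weighted sum of expression values for the signature genes.

    `expression` maps gene symbol -> expression value for one patient.
    """
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def stratify(scores, cutoff=None):
    """Label each patient 'high' or 'low' risk.

    `scores` maps patient ID -> risk score. If no cutoff is given, the
    median score is used (an assumed convention, not taken from the study).
    Scores strictly above the cutoff are labelled 'high'.
    """
    if cutoff is None:
        cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
```

In practice, the cutoff would be fixed in the training cohort and then applied unchanged to the validation cohort, so that the validation result is not tuned to its own data.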
Conclusion
We established a predictive five-gene signature that could be helpful for prognosis assessment and personalized management of breast cancer patients.