Background: Because 75% of patients with non-small cell lung cancer (NSCLC) are diagnosed at an intermediate or advanced stage, and the five-year survival rate is only 7%-17%, there is a need for better ways to assess prognosis and improve five-year survival in NSCLC. We used bioinformatics analysis of NSCLC samples from The Cancer Genome Atlas (TCGA) database to screen for differentially expressed genes and to identify a multigene model for risk assessment of NSCLC patients, which is important for individualised clinical treatment and prognosis. Given the limitations of the samples in this study, further validation in clinical and basic experiments is needed.
Methods and results: A total of 519 NSCLC-associated samples from the TCGA database were screened using bioinformatics methods, and differentially expressed genes were selected by univariate analysis and a Least Absolute Shrinkage and Selection Operator (LASSO) regression model. The most effective multigene model was selected by multigene analysis, and its validity was verified by survival analysis and Receiver Operating Characteristic (ROC) curves. Finally, the differentially expressed mRNAs were subjected to enrichment analysis using the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) databases. The GO enrichment analysis showed that the differentially expressed genes were associated with extracellular structure organization, external encapsulating structure organization and extracellular matrix organization. The KEGG enrichment analysis indicated that the differentially expressed genes were associated with histidine metabolism, calcium signalling pathways and cytokine-cytokine receptor interactions, among others. In summary, a multigene model consisting of 22 genes can be used as a tool for the prognosis of NSCLC.
Conclusion: Multigene models provide an effective approach to the prognosis of NSCLC. In this study, we identified a multigene model that can serve as a risk assessment model for the prognosis of NSCLC.
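
For illustration only, the way a multigene risk assessment model of this kind is commonly applied can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure used in this study: it assumes the risk score is a linear combination of gene expression values weighted by regression coefficients, and that patients are split into high-risk and low-risk groups at the median score. The gene names (GENE_A, GENE_B), coefficients, patient identifiers and expression values are all hypothetical and do not correspond to the 22 genes of the model.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear risk score: sum of coefficient * expression over the model genes."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def stratify(patients, coefficients):
    """Score every patient and split into high/low risk at the median score.

    The median cutoff is an assumption made for this sketch.
    """
    scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
    return scores, cutoff, groups


# Hypothetical coefficients and expression values, for illustration only.
coefficients = {"GENE_A": 0.5, "GENE_B": -0.2}
patients = {
    "patient_1": {"GENE_A": 2.0, "GENE_B": 1.0},
    "patient_2": {"GENE_A": 1.0, "GENE_B": 3.0},
    "patient_3": {"GENE_A": 4.0, "GENE_B": 0.0},
}

scores, cutoff, groups = stratify(patients, coefficients)
for pid in patients:
    print(pid, round(scores[pid], 3), groups[pid])
```

In such a scheme, the resulting high-risk and low-risk groups would then be compared by survival analysis, and the score itself evaluated with ROC curves, as described in the methods.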