Objective
To investigate, using bioinformatics methods, genes associated with colorectal cancer in ulcerative colitis (UC).
Methods
Using GSE47908, we intersected three sets of differentially expressed genes (UC versus healthy controls, UC dysplasia versus UC, and UC dysplasia versus healthy controls) to obtain overlapping genes. These genes were validated in the TCGA colon adenocarcinoma (COAD) dataset and in GSE40967 to screen for risk genes. The COAD datasets GSE110224 and GSE113513 and the UC- and COAD-related dataset GSE3629 were normalized and then integrated for weighted gene co-expression network analysis (WGCNA). A nomogram was constructed from the expression, in the UC dysplasia and UC samples of GSE47908, of the genes shared between the WGCNA module genes and the risk genes.
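The intersection of the three differential gene sets can be illustrated with a minimal sketch. The gene names and the helper `overlapping_genes` are hypothetical placeholders, not data or code from the study, and the differential expression testing that produces each set is assumed to have been done beforehand.

```python
def overlapping_genes(*deg_sets):
    """Return the genes present in every differential-expression set."""
    sets = [set(s) for s in deg_sets]
    if not sets:
        return set()
    return set.intersection(*sets)


# Hypothetical toy gene lists; these are NOT results from GSE47908.
uc_vs_control = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}
dysplasia_vs_uc = {"GENE_B", "GENE_C", "GENE_E"}
dysplasia_vs_control = {"GENE_B", "GENE_C", "GENE_D", "GENE_F"}

shared = overlapping_genes(uc_vs_control, dysplasia_vs_uc, dysplasia_vs_control)
print(sorted(shared))  # ['GENE_B', 'GENE_C']
```

Only genes that are differential in all three comparisons survive, which is why the overlap is usually much smaller than any single list.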
Results
Screening for differentially expressed genes yielded 1576 overlapping genes, which were validated in the TCGA and GSE colorectal cancer datasets and used to construct a prognostic model. The P-values in the survival analysis and for progression-free survival were all less than 0.05, and the area under the ROC curve (AUC) of the risk score was 0.894, suggesting that the risk score may be an accurate predictor of patient prognosis. WGCNA of UC, COAD, and healthy control samples then identified five gene modules; intersecting the module genes with the overlapping genes gave 490 genes. A nomogram built with the LASSO algorithm yielded seven key genes for predicting the risk of UC progressing to COAD.
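For readers unfamiliar with the AUC, the following sketch shows one common way to compute it for a binary outcome: the fraction of (event, non-event) pairs in which the event sample has the higher risk score, with ties counted as one half. The scores and labels are invented for illustration, `roc_auc` is a hypothetical helper, and the sketch ignores censoring and time dependence, which a survival ROC analysis such as the one reported here would need to handle.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical risk scores and outcomes (1 = event, 0 = no event).
risk_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
events = [1, 1, 1, 0, 0, 0]

auc = roc_auc(risk_scores, events)
print(round(auc, 3))  # 0.889 (8 of 9 pairs ranked correctly)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a value near 0.9 indicates that the score ranks most event samples above most non-event samples.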
Conclusion
We identified seven genes that could serve as key biomarkers of colorectal cancer susceptibility in patients with ulcerative colitis.