Background
Diagnosing and treating ulcerative colitis (UC) remains challenging, which prompted this study to identify useful biomarkers and clarify the underlying molecular mechanisms.

Methods
Transcriptomic data from intestinal mucosal biopsies were analyzed by Robust Rank Aggregation (RRA) to identify differentially expressed genes (DEGs). These DEGs were intersected with key UC genes from Weighted Gene Co-expression Network Analysis (WGCNA). Machine learning was then used to identify UC signature genes, which served as the basis for a predictive model. Validation used external datasets to assess diagnosis, disease progression, and drug efficacy, together with ELISA testing of clinical serum samples.

Results
RRA integrative analysis identified 251 up-regulated and 211 down-regulated DEGs; intersecting these with the key UC genes from WGCNA yielded 212 key DEGs. Machine learning on the key DEGs then identified five UC signature biomarkers: THY1, SLC6A14, ECSCR, FAP, and GPR109B. A logistic regression model incorporating these five genes was constructed. The AUC values for the model set and the internal validation set were 0.995 and 0.959, respectively. Mechanistically, KEGG and GSVA analyses indicated activation of the IL-17, TNF, and PI3K-Akt signaling pathways in UC, and the activity of these pathways was positively correlated with the signature biomarkers. Additionally, expression of the signature biomarkers was strongly correlated with various UC types and with drug efficacy across different datasets. Notably, ECSCR was upregulated in UC serum and correlated positively with neutrophil levels in UC patients.

Conclusions
THY1, SLC6A14, ECSCR, FAP, and GPR109B can serve as potential biomarkers of UC and are closely related to signaling pathways associated with UC progression. The discovery of these markers provides valuable information for understanding the molecular mechanisms of UC.
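
To make the reported model and AUC values more concrete, the following is a minimal Python sketch of how a five-gene logistic regression score and an AUC could be computed. It is illustrative only and is not the study's implementation: the function names `logistic_score` and `roc_auc`, and any coefficients, intercept, or expression values passed to them, are hypothetical, since the abstract does not report fitted parameters.

```python
import math

# The five signature genes named in the abstract.
GENES = ["THY1", "SLC6A14", "ECSCR", "FAP", "GPR109B"]


def logistic_score(expression, coefficients, intercept):
    """Return a predicted probability of UC from a logistic model.

    `expression` and `coefficients` map gene name -> value. The
    coefficients and intercept are placeholders (assumptions); the
    abstract does not give the fitted values.
    """
    z = intercept + sum(coefficients[g] * expression[g] for g in GENES)
    return 1.0 / (1.0 + math.exp(-z))


def roc_auc(labels, scores):
    """Compute AUC as the probability that a randomly chosen positive
    sample scores higher than a randomly chosen negative sample,
    counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

In this sketch, `logistic_score` would be applied to each sample's expression values, and `roc_auc` would then be evaluated separately on the model set and the validation set using the true UC/control labels. The pairwise AUC definition is used for clarity; it is quadratic in the number of samples, which is acceptable for cohort sizes typical of biopsy datasets.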