The aim of the present study was to identify key genes in colorectal cancer (CRC) that could be used to reliably diagnose the disease, and to explore the potential underlying mechanisms in silico. The gene expression profiles of the primary human cancer datasets GSE21510 and GSE32323 were downloaded from the Gene Expression Omnibus database. The limma R software package was used to identify differentially expressed (DE) genes. Gene Ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed on the DE genes using the Database for Annotation, Visualization and Integrated Discovery. The Search Tool for the Retrieval of Interacting Genes/Proteins database was used to construct a protein-protein interaction (PPI) network of the DE genes. Survival was analyzed and visualized using data from The Cancer Genome Atlas (TCGA). A total of 1,126 genes were significantly DE in the present study. The DE genes were enriched in KEGG pathways including 'cell cycle', 'mineral absorption', 'pancreatic secretion', 'pathways in cancer', 'metabolic pathways', 'aldosterone-regulated sodium reabsorption' and 'Wnt signaling pathway'. A total of 5 hub genes enriched in cell cycle and tumor-associated pathways, namely E2F2, SKP2, MYC, CDKN1A and CDKN2B, were validated as significantly DE between tumor and normal tissues. CDKN1A and CDKN2B were identified within the PPI network using the Molecular Complex Detection algorithm. Survival and content distribution analyses of 362 clinical samples from TCGA revealed that CDKN1A effectively predicted patient prognosis. The present study identified key genes and potential signaling pathways involved in CRC. These findings may provide new insights for survival assessment during the clinical diagnosis of CRC.
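
The pathway enrichment step described above can be illustrated with a minimal over-representation sketch. It computes a plain one-sided hypergeometric p-value, which is related to, but not identical to, the modified Fisher exact test (EASE score) that DAVID reports, so it should not be read as the study's exact procedure. The function name `enrichment_p_value` and all numbers other than the 1,126 DE genes (background size, pathway size, overlap) are hypothetical values chosen for illustration only.

```python
from math import comb


def enrichment_p_value(n_background, n_pathway, n_de, n_overlap):
    """One-sided hypergeometric p-value for pathway over-representation.

    Returns P(X >= n_overlap), where X is the number of pathway genes seen
    when n_de genes are drawn without replacement from n_background genes,
    of which n_pathway belong to the pathway.
    """
    total = comb(n_background, n_de)
    upper = min(n_pathway, n_de)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_de - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical inputs: only the DE gene count (1,126) comes from the abstract.
N_BACKGROUND = 20000  # assumed number of annotated background genes
N_PATHWAY = 124       # assumed size of one KEGG pathway gene set
N_DE = 1126           # DE genes reported in the study
N_OVERLAP = 30        # assumed number of DE genes falling in the pathway

p_value = enrichment_p_value(N_BACKGROUND, N_PATHWAY, N_DE, N_OVERLAP)
print(f"illustrative enrichment p-value: {p_value:.3g}")
```

In practice, such p-values would be computed for every pathway and corrected for multiple testing before any pathway is reported as enriched.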