Background: The incidence of hepatocellular carcinoma (HCC) is rising worldwide, and therapeutic efficacy remains limited by tumor microenvironment heterogeneity and the difficulty of early-stage screening. This study aimed to develop and validate a gene set-based signature for patients with early-stage HCC (eHCC) and to explore the dysregulation mechanisms of specific markers and their immune characteristics.

Methods: We performed an integrated bioinformatics analysis of genomic, transcriptomic, and clinical data from three independent cohorts. We systematically examined the crosstalk between specific genes, tumor prognosis, immune characteristics, and biological function in samples from different pathological stages. Univariate and multivariate survival analyses were performed in The Cancer Genome Atlas (TCGA) patients with survival data. Diethylnitrosamine (DEN)-induced HCC in Wistar rats was used to verify the reliability of the predictions.

Results: Through integrated analysis of expression, overall survival, immune cell characteristics, and biological function landscapes, we identified a gene cluster (Cluster C1) that potentially segregates patients with eHCC from non-tumor samples. Immune infiltration analysis showed that lower infiltration of specific immune cells, such as CD8 Tem and cytotoxic T cells (CTLs) in eHCC, may be responsible for significantly worse prognosis in HCC (hazard ratio, 1.691; 95% CI: 1.171–2.441; p = 0.012). The Cluster C1 signature showed high accuracy in predicting CD8 Tem and CTL immune cells (receiver operating characteristic (ROC) = 0.647) and cancerization (ROC = 0.946) in the liver. Overexpression of PRKDC, a central member of Cluster C1, was associated with higher genetic alteration in eHCC than in advanced-stage HCC (aHCC) and was also linked to immune cell-related poor prognosis. Finally, the predictive outcome of Cluster C1 and PRKDC alteration was confirmed in DEN-induced eHCC rats.

Conclusions: As a tumor prognosis-relevant gene set-based signature, Cluster C1 offers an effective approach to predicting cancerization of eHCC and its related immune characteristics, with considerable clinical value.
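
For readers unfamiliar with the ROC values quoted above, the following is a minimal sketch of how an area under the ROC curve can be computed from a continuous signature score and binary labels. It assumes the reported ROC values are areas under the curve, which the abstract does not state explicitly. The function name `roc_auc` and the toy scores and labels are hypothetical illustrations, not the study's data or pipeline.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs in which the positive sample has the
    higher score; tied pairs count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical signature scores (NOT data from the study):
# label 1 = tumor sample, label 0 = non-tumor sample.
toy_scores = [0.91, 0.80, 0.75, 0.40, 0.62, 0.30, 0.20, 0.55]
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]

auc = roc_auc(toy_scores, toy_labels)
print(f"AUC = {auc:.3f}")
```

Under this reading, a value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.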