Background
An in-depth understanding of the key molecules and mechanisms involved in the carcinogenesis, proliferation, and relapse of acute myeloid leukemia (AML) is critical, because it provides a basis for disease screening, early diagnosis, prognostic assessment, and the development of effective treatment strategies.
Methods
We downloaded AML transcriptome data sets from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases. Differentially expressed genes (DEGs) were screened using the limma package in R. Gene Ontology (GO) functional enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were performed on the DEGs using public databases. Within the DEG set, a random forest algorithm was used to identify characteristic genes of AML. The receiver operating characteristic (ROC) curve was used to evaluate the diagnostic efficacy of the selected characteristic genes, providing clues for the discovery of early diagnostic markers. The Estimate Score was calculated using the Estimation of STromal and Immune cells in MAlignant Tumor tissues using Expression data (ESTIMATE) algorithm. Spearman’s correlation test was used to explore the correlation between the characteristic genes and the Estimate Score, providing clues to the potential pathogenic mechanisms of the key genes.
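The ROC evaluation described above can be illustrated with a minimal sketch. It computes the AUC as the probability that a randomly chosen tumor sample has a higher expression value than a randomly chosen normal sample, and attaches a percentile bootstrap confidence interval. The abstract does not state how the confidence intervals were obtained, so the bootstrap is an assumption, and the function names and expression values are hypothetical and are not the study data.

```python
import random


def auc(scores, labels):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(scores, labels, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)
    idx = range(len(scores))
    stats = []
    while len(stats) < n_boot:
        sample = [rng.choice(idx) for _ in idx]
        ys = [labels[i] for i in sample]
        # Skip resamples that contain only one class; AUC is undefined there.
        if 0 < sum(ys) < len(ys):
            stats.append(auc([scores[i] for i in sample], ys))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical expression values: 1 = AML sample, 0 = normal sample.
toy_rng = random.Random(42)
toy_scores = [toy_rng.gauss(8.0, 1.0) for _ in range(40)] + \
             [toy_rng.gauss(6.0, 1.0) for _ in range(40)]
toy_labels = [1] * 40 + [0] * 40

toy_auc = auc(toy_scores, toy_labels)
toy_ci = bootstrap_ci(toy_scores, toy_labels)
print(f"AUC = {toy_auc:.3f}, 95% CI = ({toy_ci[0]:.3f} to {toy_ci[1]:.3f})")
```

The pairwise comparison is quadratic in the number of samples, which is acceptable for cohorts of a few hundred samples. A rank-based formulation would be preferable for larger data sets.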
Results
A total of 1,494 DEGs were identified between AML samples and normal samples, of which 1,181 were upregulated and 313 were downregulated in AML. Two genes had a mean decrease Gini >2: CDC20 and ESM1. ROC analysis showed that the area under the curve (AUC) of CDC20 was 0.966 (95% confidence interval [CI]: 0.939 to 0.987; P<0.001) and that of ESM1 was 0.905 (95% CI: 0.849 to 0.953; P<0.001). Correlation analysis showed that the expression of both genes was negatively correlated with the Estimate Score in AML: CDC20 (R=−0.21, P=0.0036) and ESM1 (R=−0.57, P<0.001).
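As an illustration of how a Spearman coefficient such as those reported above is obtained, the following minimal sketch ranks both variables (averaging the ranks of tied values) and then computes the Pearson correlation of the ranks. The function names and input values are hypothetical and are not the study data. The P value calculation is omitted.

```python
def ranks(values):
    """1-based ranks, with tied values receiving their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    if len(x) != len(y):
        raise ValueError("inputs must have the same length")
    return pearson(ranks(x), ranks(y))


# Hypothetical gene expression values and Estimate Scores for six samples.
expression = [7.9, 8.4, 6.1, 9.2, 7.0, 8.8]
estimate_score = [310.0, 150.0, 620.0, -40.0, 480.0, 200.0]
print(f"Spearman R = {spearman(expression, estimate_score):.2f}")
```

Because only ranks enter the calculation, the coefficient is unaffected by monotone transformations of the expression values, such as a log transform.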
Conclusions
CDC20 and ESM1 were identified as AML characteristic genes by the random forest algorithm. Both genes have good diagnostic efficacy for AML and are potential biological markers. They may play a carcinogenic role by promoting tumor cell proliferation and inhibiting immune cell chemotaxis.