Background
The heterogenous cytogenetic and molecular variations were harbored by AML patients, some of which are related with AML pathogenesis and clinical outcomes. We aimed to uncover the intrinsic expression profiles correlating with prognostic genetic abnormalities by WGCNA.
Methods
We downloaded the clinical and expression dataset from BeatAML, TCGA and GEO database. Using R (version 4.0.2) and ‘WGCNA’ package, the co-expression modules correlating with the ELN2017 prognostic markers were identified (R2 ≥ 0.4, p < 0.01). ORA detected the enriched pathways for the key co-expression modules. The patients in TCGA cohort were randomly assigned into the training set (50%) and testing set (50%). The LASSO penalized regression analysis was employed to build the prediction model, fitting OS to the expression level of hub genes by ‘glmnet’ package. Then the testing and 2 independent validation sets (GSE12417 and GSE37642) were used to validate the diagnostic utility and accuracy of the model.
Results
A total of 37 gene co-expression modules and 973 hub genes were identified for the BeatAML cohort. We found that 3 modules were significantly correlated with genetic markers (the ‘lightyellow’ module for NPM1 mutation, the ‘saddlebrown’ module for RUNX1 mutation, the ‘lightgreen’ module for TP53 mutation). ORA revealed that the ‘lightyellow’ module was mainly enriched in DNA-binding transcription factor activity and activation of HOX genes. The ‘saddlebrown’ module was enriched in immune response process. And the ‘lightgreen’ module was predominantly enriched in mitosis cell cycle process. The LASSO- regression analysis identified 6 genes (NFKB2, NEK9, HOXA7, APRC5L, FAM30A and LOC105371592) with non-zero coefficients. The risk score generated from the 6-gene model, was associated with ELN2017 risk stratification, relapsed disease, and prior MDS history. The 5-year AUC for the model was 0.822 and 0.824 in the training and testing sets, respectively. Moreover, the diagnostic utility of the model was robust when it was employed in 2 validation sets (5-year AUC 0.743–0.79).
Conclusions
We established the co-expression network signature correlated with the ELN2017 recommended prognostic genetic abnormalities in AML. The 6-gene prediction model for AML survival was developed and validated by multiple datasets.