Background
Lung adenocarcinoma (LUAD) is a group of cancers with poor prognosis. The combination of single-cell RNA sequencing (scRNA-seq) and bulk RNA sequencing (RNA-seq) can identify important genes involved in cancer development and progression from a broader perspective.
Methods
The scRNA-seq and bulk RNA-seq data of LUAD were downloaded from the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases. The scRNA-seq data in the GSE131907 dataset were analyzed to identify core cells, and uniform manifold approximation and projection (UMAP) was used for dimensionality reduction and cluster identification. Macrophage polarization-associated subtypes were derived from the TCGA-LUAD dataset, and differentially expressed genes (DEGs) were then identified in the TCGA-LUAD dataset for two comparisons: normal versus LUAD tissue samples, and one subtype versus the other. Venn diagrams were used to visualize the differentially expressed and highly variable macrophage polarization-related genes (MPRGs). Subsequently, a prognostic risk model for LUAD patients was constructed using univariate Cox and least absolute shrinkage and selection operator (LASSO) regression, and the stability of the model was assessed in the external dataset GSE72094. After the correlation between the characteristic genes and significantly mutated genes was analyzed, immune infiltration was compared between the high- and low-risk groups. The Monocle package was used for pseudo-temporal trajectory analysis of the different cell clusters within the macrophage cluster. Macrophage cell clusters were then selected as key cell clusters to explore the role of the characteristic genes in different cell populations and to identify transcription factors (TFs) that affect the signature genes. Finally, qPCR was employed to validate the expression levels of the prognostic signature genes in LUAD.
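As an illustration of how a risk model of this kind is typically applied, the sketch below computes a risk score as a coefficient-weighted sum of signature-gene expression and splits patients into high- and low-risk groups. It is a minimal sketch under stated assumptions, not the authors' implementation: the gene names, coefficients, and expression values are hypothetical placeholders, and the median cutoff is an assumed convention that the text above does not specify.

```python
from statistics import median

# Hypothetical LASSO-Cox coefficients with placeholder gene names.
# These are NOT the values from the study.
COEFS = {"GENE_A": 0.30, "GENE_B": -0.20, "GENE_C": 0.10}


def risk_score(expression, coefs=COEFS):
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(coefs[gene] * expression[gene] for gene in coefs)


def stratify(samples, coefs=COEFS):
    """Score every sample and label it 'high' or 'low' risk.

    The cohort median is used as the cutoff (an assumption for illustration).
    """
    scores = {sid: risk_score(expr, coefs) for sid, expr in samples.items()}
    cutoff = median(scores.values())
    groups = {sid: ("high" if s > cutoff else "low") for sid, s in scores.items()}
    return scores, groups


# Synthetic expression values, for illustration only.
SAMPLES = {
    "P1": {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 0.0},
    "P2": {"GENE_A": 0.0, "GENE_B": 2.0, "GENE_C": 1.0},
    "P3": {"GENE_A": 1.0, "GENE_B": 0.0, "GENE_C": 3.0},
    "P4": {"GENE_A": 3.0, "GENE_B": 0.0, "GENE_C": 2.0},
}

scores, groups = stratify(SAMPLES)
for sid in SAMPLES:
    print(sid, round(scores[sid], 3), groups[sid])
```

In an external validation cohort, the same fixed coefficients would be applied to that cohort's expression values; whether the cutoff is re-derived there or carried over from the training cohort is a design choice not stated in the text above.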
Results
A total of 424 highly variable macrophage genes, 3920 DEGs, and 9561 DEGs were obtained from the macrophage clusters, the macrophage polarization-related subtypes, and the normal/LUAD tissue samples, respectively. Twenty-eight differentially expressed and highly variable MPRGs (DE-MPRGs) were obtained. A prognostic risk model with 7 DE-MPRGs (RGS13, ADRB2, DDIT4, MS4A2, ALDH2, CTSH, and PKM) was constructed. The model retained good predictive performance in the GSE72094 dataset. ZNF536 and DNAH9 were mutated in the low-risk group, while COL11A1 was mutated in the high-risk group, and all three were highly correlated with the characteristic genes. A total of 11 immune cell types differed significantly between the high- and low-risk groups. Five cell types were further identified within the macrophage cluster; CD56hiCD62L+ NK cells differentiated earlier and were present mainly on 2 branches, whereas macrophages were also present on 2 branches but differentiated later. The expression levels of BCLAF1 and MAX were higher in cluster 1, suggesting that they might be the TFs affecting the expression of the characteristic genes. Moreover, qPCR confi...