Objective: Nonalcoholic fatty liver disease (NAFLD) is a serious threat to human health worldwide. This study aimed to identify diagnostic biomarkers of NAFLD and to analyze their relationship with the immune microenvironment using bioinformatics analysis.

Methods: We downloaded microarray datasets (GSE48452 and GSE63067) from the Gene Expression Omnibus (GEO) database to screen for differentially expressed genes (DEGs). Hub genes were screened by a series of machine learning analyses, including support vector machine (SVM), least absolute shrinkage and selection operator (LASSO), and weighted gene co-expression network analysis (WGCNA). We also used gene enrichment analysis to explore the pathways driving NAFLD occurrence. The hub genes were then validated in an external dataset (GSE66676). The CIBERSORT algorithm was used to estimate the proportions of different immune cell types, and Spearman correlation analysis was used to assess the relationship between hub genes and immune cells.

Results: Three hub genes (CAMK1D, CENPV, and TRHDE) were identified. Enrichment analysis indicated that the pathogenesis of NAFLD is mainly related to nutrient metabolism and the immune system. In the correlation analysis, CENPV expression showed a strong negative correlation with resting memory CD4 T cells, and TRHDE expression showed a strong positive correlation with naive B cells.

Conclusion: CAMK1D, CENPV, and TRHDE play regulatory roles in NAFLD. In particular, CENPV and TRHDE may regulate the immune microenvironment by mediating resting memory CD4 T cells and naive B cells, respectively, and thereby influence disease progression.
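
To illustrate the final correlation step, the following is a minimal sketch of a Spearman rank correlation between one gene's expression and one immune cell fraction across samples. It is not the study's implementation. The function names (`average_ranks`, `pearson`, `spearman`) and the input values are hypothetical, chosen only for demonstration. The sketch assumes that ties are handled with average ranks, which is a common convention.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied values that starts at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


# Made-up illustrative values, NOT data from the study.
gene_expression = [2.1, 3.4, 1.8, 4.0, 2.9]       # one value per sample
cell_fraction = [0.30, 0.18, 0.35, 0.10, 0.22]    # estimated fraction per sample

rho = spearman(gene_expression, cell_fraction)
print(f"Spearman rho = {rho:.2f}")
```

In this toy input the two rankings are exact mirror images, so rho comes out at -1, the strongest possible negative monotonic association. In practice one would also compute a p-value and correct for multiple testing across gene and cell-type pairs.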