Venous thromboembolism (VTE) is characterized by a high recurrence rate and adverse consequences, including high mortality. Damage to vascular endothelial cells (VECs) serves a key role in VTE, and lactate (LA) metabolism is associated with VEC damage. However, the pathogenesis of VTE and the role of lactate metabolism-related molecules (LMRMs) remain unclear. Using the GSE48000 dataset, the present study identified differentially expressed (DE-)LMRMs between healthy individuals and those with VTE. The LMRMs were then used to establish four machine learning models: random forest, support vector machine, generalized linear model (GLM) and eXtreme gradient boosting. The disease prediction efficiency of the models was verified using nomograms, calibration curves, decision curve analyses and external datasets. The optimal machine learning model was used to predict genes involved in the disease, and an in vitro oxygen-glucose deprivation (OGD) model was used to assess the survival rate, LA levels and LMRM expression levels of VECs. A total of four DE-LMRMs, namely solute carrier family 16 member 1 (SLC16A1), SLC16A7, SLC16A8 and SLC5A12, were obtained, and GLM was identified as the best-performing model based on its ability to predict differential expression of the embigin, lactate dehydrogenase B, SLC16A1, SLC5A12 and SLC16A8 genes. Additionally, SLC16A1, SLC16A7 and SLC16A8 served key roles in VTE. In the OGD model, the VEC survival rate decreased significantly, intracellular LA levels increased significantly and SLC16A1 expression levels decreased significantly. Thus, LMRMs may be involved in VTE pathogenesis and may be used to build accurate VTE prediction models. Furthermore, it was hypothesized that the observed increase in intracellular LA levels in VECs was associated with the decrease in SLC16A1 expression. Therefore, SLC16A1 expression may be an essential target for VTE treatment.
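
As a rough illustration of the modelling step, the sketch below fits a binomial GLM (logistic regression) to separate VTE from control samples and scores it by AUC. It is not the study's pipeline. The data are synthetic and the group mean shifts are invented for the example; only the four gene names come from the abstract. The helper names (`make_synthetic_cohort`, `fit_logistic_glm`, `auc`) are hypothetical, and plain gradient descent stands in for whatever fitting routine the authors used.

```python
import math
import random

# Feature labels taken from the abstract; the values generated below are synthetic.
GENES = ["SLC16A1", "SLC16A7", "SLC16A8", "SLC5A12"]


def make_synthetic_cohort(n_per_group, seed=0):
    """Return (X, y) with hypothetical standardized expression values; y = 1 means VTE."""
    rng = random.Random(seed)
    # Assumed mean shifts for the VTE group, chosen only for illustration.
    shifts = [-1.0, 0.8, 0.6, -0.5]
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_group):
            X.append([rng.gauss(label * s, 1.0) for s in shifts])
            y.append(label)
    return X, y


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(w, b, x):
    """Predicted probability of VTE for one sample."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def fit_logistic_glm(X, y, lr=0.1, epochs=500):
    """Fit a binomial GLM with a logit link by batch gradient descent."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            grad_b += err
            for j in range(p):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
    return w, b


def auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Train on one synthetic cohort and evaluate on a second, independent one.
X_train, y_train = make_synthetic_cohort(100, seed=1)
X_test, y_test = make_synthetic_cohort(50, seed=2)
w, b = fit_logistic_glm(X_train, y_train)
test_auc = auc([predict_proba(w, b, x) for x in X_test], y_test)

for gene, coef in zip(GENES, w):
    print(f"{gene}: coefficient {coef:+.2f}")
print(f"Held-out AUC: {test_auc:.2f}")
```

Evaluating on a separately generated cohort loosely mirrors the use of external datasets for validation. Because the inputs are simulated, the coefficients and AUC printed here carry no biological meaning.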