Background: This study evaluated the diagnostic value of combinations of the tumor markers carcinoembryonic antigen (CEA), carbohydrate antigen (CA) 125, CA153, and CA19-9 in distinguishing malignant pleural effusion (MPE) from non-malignant pleural effusion (non-MPE) using machine learning, and compared the performance of popular machine learning methods. Methods: A total of 319 samples were collected from patients with pleural effusion in Beijing and Wuhan, China, from January 2018 to June 2020. Five machine learning methods, namely logistic regression, extreme gradient boosting (XGBoost), Bayesian additive regression trees, random forest, and support vector machine, were applied to construct diagnostic models. Sensitivity, specificity, Youden's index, and the area under the receiver operating characteristic curve (AUC) were used to evaluate model performance. Results: Among diagnostic models with a single tumor marker, the XGBoost model using CEA performed best (AUC = 0.895, sensitivity = 0.80), and the XGBoost model using CA153 showed the highest specificity (0.98). Among all combinations of tumor markers, the combination of CEA and CA153 achieved the best performance (AUC = 0.921, sensitivity = 0.85) in identifying MPE under the diagnostic model constructed by XGBoost. Conclusions: Diagnostic models for MPE that combined multiple tumor markers outperformed models with a single tumor marker, particularly in sensitivity. Machine learning methods, especially XGBoost, could improve the diagnostic accuracy for MPE.
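
For readers unfamiliar with the evaluation metrics named in the Methods, the following minimal Python sketch shows how sensitivity, specificity, Youden's index (sensitivity + specificity - 1), and AUC can be computed from binary labels and predicted scores. It is an illustration only: the function names are hypothetical, the 0.5 decision threshold is an assumption, and the labels and scores are made-up toy values, not data from this study. AUC is computed here through its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half), which may differ from the software used by the authors.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) when scores >= threshold are called positive.

    Assumes y_true contains at least one positive (1) and one negative (0) label.
    """
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1 and predicted == 0:
            fn += 1
        elif label == 0 and predicted == 0:
            tn += 1
        else:
            fp += 1
    return tp / (tp + fn), tn / (tn + fp)


def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's index J = sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for label, s in zip(y_true, scores) if label == 1]
    negatives = [s for label, s in zip(y_true, scores) if label == 0]
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy example (hypothetical values): 1 = MPE, 0 = non-MPE.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

sens, spec = sensitivity_specificity(toy_labels, toy_scores, threshold=0.5)
print("sensitivity:", sens)
print("specificity:", spec)
print("Youden's index:", youden_index(sens, spec))
print("AUC:", auc(toy_labels, toy_scores))
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds; this is why a model can have the highest specificity without having the highest AUC.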