Background:
Although the surgical treatment strategy for rectal cancer (RC) is usually based on the preoperative diagnosis of lymph node metastasis (LNM), accurate diagnosis of LNM remains a clinical challenge. In this study, we used a privacy-preserving computing platform (PPCP) to develop machine learning (ML) models that predict LNM status before surgery, and we created a web tool to support clinicians in making treatment decisions for RC patients.
Patients and methods:
A total of 6578 RC patients were enrolled in this study. ML algorithms, including logistic regression, support vector machine, extreme gradient boosting (XGB), and random forest, were used to establish the prediction models. The areas under the receiver operating characteristic curves (AUCs) were calculated to compare the accuracy of the ML models with that of the US guidelines and the clinical diagnosis of LNM. Model establishment and validation were performed in the PPCP without the exchange of raw data among institutions.
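For readers unfamiliar with the metric, the following is a minimal sketch of how an AUC and an accompanying 95% CI can be computed for one model's predicted probabilities. It is illustrative only: the abstract does not state how the CIs were derived, so the percentile bootstrap used here is an assumption, and the function names `auc` and `bootstrap_ci` are hypothetical rather than part of the study's pipeline.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case
    (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (an assumed method,
    not one confirmed by the abstract)."""
    if len(set(labels)) < 2:
        raise ValueError("labels must contain both classes")
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        # Resample cases with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined if a resample has only one class
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Toy data for illustration only (1 = LNM present, 0 = LNM absent).
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]
print(auc(y_true, y_prob))  # 0.75
```

Applying the same two functions to each model's predictions on a shared validation set would give AUCs and intervals that can be compared directly across models.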
Results:
LNM was detected in 1006 (35.3%), 252 (35.3%), 581 (32.9%), and 342 (27.4%) RC patients in the training, test, and external validation sets 1 and 2, respectively. The XGB model was identified as the optimal model, with an AUC of 0.84 [95% confidence interval (CI), 0.83–0.86], compared with the logistic regression model (AUC, 0.76; 95% CI, 0.74–0.78), the random forest model (AUC, 0.82; 95% CI, 0.81–0.84), and the support vector machine model (AUC, 0.79; 95% CI, 0.78–0.81). Furthermore, the XGB model showed higher accuracy than the predictive factors of the US guidelines and the clinical diagnosis. The XGB model was embedded in a web tool (named LN-MASTER) to predict the LNM status of RC patients.
Conclusion:
The proposed easy-to-use model showed good performance for LNM prediction, and the web tool can help clinicians make treatment decisions for patients with RC. Furthermore, the PPCP enables state-of-the-art model development despite limited local data availability.