Objective
This study aimed to develop individualized survival prediction models, based on multiple machine learning (ML) algorithms, for patients with remnant gastric cancer (RGC).
Methods
Clinicopathologic data of 286 patients with RGC who underwent surgery (radical or palliative resection) were retrieved from a multi-institution database and analyzed retrospectively. The patients were randomly allocated to a training cohort (80%) and a test cohort (20%). Nine commonly used ML methods were used to construct survival prediction models. Model performance was evaluated using accuracy, precision, recall, F1-score, area under the receiver operating characteristic curve (AUC), confusion matrices, five-fold cross-validation, decision curve analysis (DCA), and calibration curves. The best-performing model was selected after verification and validation and was interpreted using the SHapley Additive exPlanations (SHAP) approach.
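As an illustration of the evaluation steps named above, the following minimal sketch shows a random 80/20 allocation and the derivation of accuracy, precision, recall, and F1-score from confusion-matrix counts. It is not the study's code: the function names, the fixed seed, the rounding rule for the test cohort size, and the toy labels are all assumptions made for the example.

```python
import random


def split_train_test(records, test_fraction=0.2, seed=0):
    """Randomly allocate records to a training and a test cohort.

    Hypothetical helper: the seed and the rounding rule are assumptions.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def classification_metrics(y_true, y_pred):
    """Compute confusion-matrix counts and the derived metrics for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "confusion": {"tp": tp, "tn": tn, "fp": fp, "fn": fn},
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# 286 patient indices stand in for the real records.
train, test = split_train_test(range(286))

# Toy labels (1 = event, 0 = no event), invented for illustration only.
metrics = classification_metrics([1, 1, 1, 0, 0, 0, 1, 0],
                                 [1, 0, 1, 0, 1, 0, 1, 0])
```

With 286 patients and this rounding rule, the sketch yields 229 training and 57 test records; a different rounding convention would shift the counts by one.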
Results
Compared with traditional methods, the ML-based RGC survival prediction models exhibited good performance. All models except the decision tree model performed well, with a mean ROC AUC above 0.7. The DCA findings suggested that the developed models have the potential to enhance clinical decision-making and thereby improve patient outcomes. The calibration curves showed that all models except the decision tree model displayed good predictive performance. CatBoost-based modeling with SHAP analysis indicated that the five-year survival probability was significantly influenced by several factors: the lymph node ratio (LNR), T stage, tumor size, resection margins, perineural invasion, and distant metastasis.
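For readers unfamiliar with decision curve analysis, the sketch below computes the standard net benefit at a single threshold probability, NB = TP/n - (FP/n) * pt / (1 - pt), and compares it with a treat-all strategy. It is a hedged illustration only: the function names and the toy probabilities are assumptions, not values from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on predictions with probability >= threshold."""
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy outcomes and predicted probabilities, invented for illustration only.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.6, 0.4, 0.1]

nb_model = net_benefit(y_true, y_prob, threshold=0.2)
nb_all = net_benefit_treat_all(y_true, threshold=0.2)
```

A decision curve repeats this calculation over a range of thresholds; a model is considered useful at thresholds where its net benefit exceeds both the treat-all value and zero (treat-none).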
Conclusions
This study established ML-based models for predicting five-year survival probability in patients with RGC; the models showed high accuracy and practical clinical value.