Purpose: Traditional prognostic studies have used varying cut-off values without evaluating the information potentially contained in inflammation-related hematological indicators. Using the interpretable machine-learning algorithm RuleFit, this study aimed to identify inflammatory rules that reflect prognosis in patients with nasopharyngeal carcinoma (NPC).
Patients and Methods: In total, 1706 patients with biopsy-proven NPC treated in two independent hospitals (1320 and 386 patients, respectively) between January 2010 and March 2014 were included. RuleFit was used to develop risk-predictive rules from hematological indicators whose distributions did not differ between the two centers. Hematological rules associated with time-to-event outcomes were then selected by stepwise multivariate Cox analysis. A final model was established by combining the high-efficiency hematological rules with clinical predictors. For comparison, models based on other algorithms (AutoML, Lasso) and on clinical predictors were built, and a previously reported nomogram was also included. The area under the receiver operating characteristic curve (AUROC) and the concordance index (C-index) were used to assess the predictive performance of the different models. A site-based app was established for convenience.
Results: RuleFit identified 22 combined baseline hematological rules, achieving AUROCs of 0.69 and 0.64 in the training and validation cohorts, respectively. By contrast, the AUROCs of the best comparison model, which was based on AutoML, were 1.00 and 0.58 in the same cohorts. For overall survival, the final model had a significantly higher C-index than the base model using TN staging in both cohorts (0.769 vs 0.717, P<0.001; 0.752 vs 0.688, P<0.001) and generalized well from the training cohort to the validation cohort. The two models built on RuleFit rules outperformed the other models. The final model showed a similar trend for the other endpoints. Kaplan-Meier curves showed that 22.9% (390/1706) of patients were "misclassified" by AJCC staging, whereas the final model could classify their risk accurately.
Conclusion: The proposed final models, built on inflammation-related rules derived with RuleFit, showed significantly improved predictive performance.
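For readers unfamiliar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. It is an illustration only, not the authors' implementation: the function name `concordance_index` and the toy follow-up times, event indicators, and risk scores are hypothetical, and this simplified version ignores pairs with tied follow-up times.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[int],
    risk_scores: Sequence[float],
) -> float:
    """Harrell's C-index for right-censored data (simplified sketch).

    A pair (i, j) is comparable when subject i had an observed event
    and a strictly shorter follow-up time than subject j. The pair is
    concordant when subject i also has the higher predicted risk.
    Ties in predicted risk count as 0.5; ties in time are skipped.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort: follow-up in months, 1 = death observed, 0 = censored.
toy_times = [5, 8, 12, 20, 25]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.6, 0.3, 0.1]

toy_c_index = concordance_index(toy_times, toy_events, toy_risk)
print(toy_c_index)  # 7 concordant of 8 comparable pairs -> 0.875
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risk against survival time, which gives a scale for reading the reported values such as 0.769 vs 0.717.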