BACKGROUND
A complete response after neoadjuvant chemotherapy (rNACT) improves the surgical outcomes of patients with breast cancer; however, patients without a complete response (non-rNACT) have a higher risk of death and recurrence.
AIM
To establish novel machine learning (ML)-based models for predicting the probability of rNACT in breast cancer patients who intend to receive NACT.
METHODS
A retrospective analysis was conducted of 487 breast cancer patients who underwent mastectomy or breast-conserving surgery and axillary lymph node dissection following neoadjuvant chemotherapy at the Hubei Cancer Hospital between January 1, 2013, and October 1, 2021. The study cohort was divided into internal training and testing datasets in a 70:30 ratio. Twenty-four variables were included to develop predictive models for rNACT using multiple ML-based algorithms. A feature selection approach was used to identify the optimal predictive factors. The predictive performance of the models was evaluated using receiver operating characteristic (ROC) curves.
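The 70:30 split and ROC-based evaluation described above can be illustrated with a minimal sketch. It is not the study's pipeline: the cohort below is synthetic, the 30% response rate and score distributions are invented for illustration, and the helper names `split_70_30` and `roc_auc` are hypothetical. The area under the ROC curve is computed through its rank interpretation, that is, the probability that a randomly chosen responder receives a higher score than a randomly chosen non-responder.

```python
import random


def split_70_30(records, seed=0):
    """Shuffle a copy of the records and split it 70% training / 30% testing."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic stand-in for a cohort of 487 patients: (label, model score).
# Label 1 marks a hypothetical responder; all values are made up.
rng = random.Random(42)
cohort = []
for _ in range(487):
    label = int(rng.random() < 0.3)
    score = rng.gauss(0.6 if label else 0.4, 0.15)
    cohort.append((label, score))

train, test = split_70_30(cohort)
test_auc = roc_auc([y for y, _ in test], [s for _, s in test])
print(len(train), len(test), round(test_auc, 2))
```

In practice the scores would come from a model fitted on the training portion only, and the AUC on the held-out 30% would be the figure reported for the testing dataset.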
RESULTS
The rNACT and non-rNACT groups differed significantly in several variables, including total cholesterol, low-density lipoprotein, neutrophil-to-lymphocyte ratio, body mass index, platelet count, albumin-to-globulin ratio, platelet-to-lymphocyte ratio, and lymphocyte-to-monocyte ratio. The areas under the curve of the six models ranged from 0.81 to 0.96. Some ML-based models outperformed models built with conventional statistical methods in both ROC curves. The support vector machine (SVM) model incorporating twelve variables was identified as the best predictive model.
CONCLUSION
By incorporating pretreatment serum lipids and serum inflammation markers, it is feasible to develop ML-based models for the preoperative prediction of rNACT and thereby facilitate the choice of treatment. The SVM model in particular can improve the prediction of rNACT in patients with breast cancer.