Background
Pathological axillary lymph node (pALN) burden is an important factor for treatment decision‐making in clinical T1‐T2 (cT1‐T2) stage breast cancer. Preoperative assessment of the pALN burden and prognosis aids in the individualized selection of therapeutic approaches.

Purpose
To develop and validate a machine learning (ML) model based on clinicopathological and MRI characteristics for assessing pALN burden and survival in patients with cT1‐T2 stage breast cancer.

Study Type
Retrospective.

Population
A total of 506 females (age range: 24–83 years) with cT1‐T2 stage breast cancer from two institutions, divided into training (N = 340), internal validation (N = 85), and external validation (N = 81) cohorts.

Field Strength/Sequence
1.5‐T; axial fat‐suppressed T2‐weighted turbo spin‐echo sequence and axial three‐dimensional dynamic contrast‐enhanced fat‐suppressed T1‐weighted gradient echo sequence.

Assessment
Four ML methods (eXtreme Gradient Boosting [XGBoost], Support Vector Machine, k‐Nearest Neighbor, Classification and Regression Tree) were employed to develop models based on clinicopathological and MRI characteristics. The performance of these models was evaluated by their discriminative ability. The best‐performing model was further analyzed to establish interpretability and used to calculate the pALN score. The relationship between the pALN score and disease‐free survival (DFS) was examined.

Statistical Tests
Chi‐squared test, Fisher's exact test, univariable logistic regression, area under the curve (AUC), DeLong test, net reclassification improvement, integrated discrimination improvement, Hosmer‐Lemeshow test, log‐rank test, Cox regression analyses, and intraclass correlation coefficient were used. A P‐value <0.05 was considered statistically significant.

Results
The XGB II model, developed based on the XGBoost algorithm, outperformed the other models with AUCs of 0.805, 0.803, and 0.818 in the three cohorts.
The Shapley additive explanation plot indicated that the top‐ranked variable in the XGB II model was the Node Reporting and Data System score. In multivariable Cox regression analysis, the pALN score was significantly associated with DFS (hazard ratio: 4.013, 95% confidence interval: 1.059–15.207).

Data Conclusion
The XGB II model may allow evaluation of pALN burden and could provide prognostic information in patients with cT1‐T2 stage breast cancer.

Level of Evidence
3

Technical Efficacy
Stage 2
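
The abstract summarizes discriminative ability as the area under the ROC curve (AUC). As a minimal illustrative sketch, and not the authors' code, the AUC can be computed as the probability that a randomly chosen positive case receives a higher model score than a randomly chosen negative case. The function name `roc_auc` and the toy labels and scores below are hypothetical and are not taken from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC via the Mann-Whitney formulation.

    Returns the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = high nodal burden, 0 = low nodal burden.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.35, 0.40, 0.45, 0.60, 0.70, 0.20, 0.90]

toy_auc = roc_auc(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of cases, which is acceptable for cohorts of a few hundred patients; a rank-based implementation would be preferable for larger datasets.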