Background: The non-invasive preoperative diagnosis of microvascular invasion (MVI) in hepatocellular carcinoma (HCC) is vital for precise surgical decision-making and for patient prognosis. We aimed to develop an MVI prediction model with sound performance and clinical interpretability.

Methods: A total of 2160 patients with HCC without macroscopic invasion who underwent their first hepatectomy at West China Hospital between January 2015 and June 2019 were retrospectively included and randomly divided into a training cohort and a validation cohort at a ratio of 8:2. Preoperative demographic features, imaging characteristics, and laboratory indexes were collected. Five machine learning algorithms were used: logistic regression, random forest, support vector machine, extreme gradient boosting (XGBoost), and multilayer perceptron. Performance was evaluated using the area under the receiver operating characteristic curve (AUC). We also computed SHapley Additive exPlanations (SHAP) values to explain the influence of each feature on the MVI prediction model.

Results: According to the XGBoost model, the six most important preoperative factors associated with MVI were the maximum image diameter, protein induced by vitamin K absence or antagonist-II, α-fetoprotein level, satellite nodules, aspartate aminotransferase (AST)/alanine aminotransferase (ALT) ratio, and AST level. The XGBoost model for preoperative prediction of MVI achieved a higher AUC (0.8; 95% confidence interval: 0.74–0.83) than the other four prediction models. Furthermore, to facilitate use of the model in clinical settings, we developed a user-friendly online calculator for MVI risk prediction based on the XGBoost model.

Conclusions: The XGBoost model achieved good discriminative performance for non-invasive preoperative prediction of MVI based on a large cohort. Moreover, the MVI risk calculator could help clinicians conveniently determine the optimal treatment strategy and improve the prognosis of patients with HCC.
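To make the evaluation step concrete, the following is a minimal sketch of an 8:2 random split and of an AUC with a 95% confidence interval. It is illustrative only: the abstract does not state how the interval was computed, so the percentile bootstrap used here is an assumption, and the function names (`split_80_20`, `auc`, `bootstrap_auc_ci`), the seeds, and the number of resamples are hypothetical choices rather than the authors' code.

```python
import random


def split_80_20(n, seed=0):
    """Randomly split indices 0..n-1 into training (80%) and validation (20%)."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    cut = int(round(0.8 * n))
    return idx[:cut], idx[cut:]


def auc(labels, scores):
    """AUC as the probability that a random MVI-positive case scores higher
    than a random MVI-negative case (ties count as 0.5). Labels are 0 or 1."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point AUC plus a percentile-bootstrap interval (assumed method)."""
    point = auc(labels, scores)  # also validates that both classes are present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        sample = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in sample]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in sample]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return point, lower, upper


if __name__ == "__main__":
    # Synthetic labels and scores for demonstration; not patient data.
    rng = random.Random(42)
    labels = [rng.randint(0, 1) for _ in range(200)]
    scores = [0.3 * y + rng.random() for y in labels]
    print(bootstrap_auc_ci(labels, scores, n_boot=200))
```

With the cohort size reported above, `split_80_20(2160)` yields 1728 training and 432 validation indices. In practice the scores passed to `bootstrap_auc_ci` would be the predicted MVI probabilities of a fitted model on the validation cohort.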