Background: Osteosarcoma is the most common bone cancer in children and adolescents. Prognosis and management differ between patients with localized disease and those with metastasis at the time of diagnosis. The purpose of this study was to explore potential risk factors for metastatic disease.
Methods: The Surveillance, Epidemiology, and End Results (SEER) Program database was used to identify patients diagnosed with osteosarcoma between 2004 and 2015. We developed prediction models for distant metastasis using six machine learning (ML) techniques: logistic regression (LR), support vector machine (SVM), Gaussian Naive Bayes (GaussianNB), Extreme Gradient Boosting (XGBoost), random forest (RF), and k-nearest neighbors (kNN). The adaptive synthetic (ADASYN) sampling technique was used to address class imbalance. Shapley Additive Explanations (SHAP) analysis generated visualized explanations for each patient. Finally, model performance was evaluated using average precision (AP), sensitivity, specificity, accuracy, F1 score, precision-recall curves, calibration plots, and decision curve analysis (DCA).
Results: The six machine learning algorithms achieved AP values of 0.661-0.781 for predicting distant metastasis. The RF model performed best, with an accuracy of 71.8% and an AP of 0.781, and relied most heavily on tumor size, primary surgery, and age. SHAP analysis provided model-independent interpretation, highlighting significant clinical factors associated with the risk of metastasis in osteosarcoma patients.
Conclusions: An accurate machine learning-based prediction model was established for metastasis in osteosarcoma patients to help clinicians during clinical decision-making.
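
As an illustration of the evaluation metrics named in the Methods, the following minimal Python sketch computes sensitivity, specificity, accuracy, F1 score, and average precision from binary labels and predicted probabilities. It is not the study's code: the function names, the 0.5 decision threshold, and the toy labels and scores are all hypothetical and chosen only for illustration. AP is computed here as the mean of the precision values at the rank of each true positive, assuming no tied scores.

```python
from typing import Dict, Sequence


def threshold_metrics(y_true: Sequence[int], y_score: Sequence[float],
                      threshold: float = 0.5) -> Dict[str, float]:
    """Sensitivity, specificity, accuracy and F1 at a fixed threshold.

    The 0.5 default is an assumption for illustration only.
    """
    y_pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    accuracy = (tp + tn) / len(y_true)
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity,
            "accuracy": accuracy, "f1": f1}


def average_precision(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Average precision: mean precision at the rank of each positive.

    Samples are ranked by descending score; tied scores are not
    grouped, so this sketch assumes all scores are distinct.
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        return 0.0
    order = sorted(range(len(y_score)), key=lambda i: y_score[i], reverse=True)
    hits = 0
    total = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / n_pos


# Hypothetical toy data (1 = metastasis at diagnosis); not SEER data.
toy_labels = [1, 0, 1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

toy_metrics = threshold_metrics(toy_labels, toy_scores)
toy_ap = average_precision(toy_labels, toy_scores)
print(toy_metrics)
print(round(toy_ap, 3))
```

Unlike accuracy, AP summarizes the whole precision-recall curve and does not depend on a single threshold, which is one reason it is often preferred when the positive class (here, metastatic disease) is the minority.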