OBJECTIVE
Chordomas are rare tumors that arise from notochordal remnants; they account for 1%–4% of all primary bone malignancies and most often occur in the clivus and sacrum. Despite margin-negative resection and postoperative radiotherapy, chordomas often recur. Further, immunohistochemical (IHC) markers have not been assessed as predictors of chordoma recurrence. The authors aimed to identify the IHC markers that predict postoperative long-term (≥ 1 year) chordoma recurrence by training multiple tree-based machine learning (ML) algorithms.
METHODS
The authors reviewed the records of patients who had undergone treatment for clival and spinal chordomas between January 2017 and June 2021 across the Mayo Clinic enterprise (Minnesota, Florida, and Arizona). Demographics, type of treatment, histopathology, and other relevant clinical factors were abstracted from each patient record. Decision tree and random forest classifiers were trained on 80% of the data and tested on the remaining, unseen 20% to predict long-term recurrence.
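The train/test workflow described above can be sketched as follows. This is a minimal illustration in Python with scikit-learn, not the authors' pipeline: the software they used is not stated, the feature names are hypothetical, the cohort is synthetic (it only mirrors the reported sample size of 151), and the stratified split and the tree depth are assumptions. The 500-tree forest size is taken from the reported results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical feature names; the values generated below are NOT patient data.
FEATURES = ["age", "s100", "pan_cytokeratin", "ema", "followup_days"]


def make_synthetic_cohort(n=151, seed=0):
    """Build a synthetic cohort with the same size as the reported sample."""
    rng = np.random.default_rng(seed)
    age = rng.normal(55, 15, n)
    markers = rng.integers(0, 2, size=(n, 3))  # binary IHC positivity flags
    followup = rng.integers(100, 3000, n)      # follow-up length in days
    X = np.column_stack([age, markers, followup])
    # Arbitrary rule so that the synthetic labels carry some signal.
    score = markers[:, 0] + markers[:, 1] + (followup > 1800) + rng.normal(0, 0.5, n)
    y = (score > 1.5).astype(int)              # 1 = recurrence, 0 = no recurrence
    return X, y


def fit_and_score(X, y, seed=0):
    """Train both classifiers on 80% of the rows and score them on the held-out 20%."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.20, stratify=y, random_state=seed  # stratification is an assumption
    )
    tree = DecisionTreeClassifier(max_depth=3, random_state=seed).fit(X_train, y_train)
    forest = RandomForestClassifier(n_estimators=500, random_state=seed).fit(X_train, y_train)
    return {
        "n_test": len(y_test),
        "tree_accuracy": accuracy_score(y_test, tree.predict(X_test)),
        "forest_accuracy": accuracy_score(y_test, forest.predict(X_test)),
        "forest_importances": dict(zip(FEATURES, forest.feature_importances_)),
    }


X, y = make_synthetic_cohort()
results = fit_and_score(X, y)
```

Ranking `results["forest_importances"]` from highest to lowest gives the kind of variable ranking a random forest produces. Accuracy on the held-out rows is the only figure here that estimates performance on unseen data.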
RESULTS
One hundred fifty-one patients diagnosed with and treated for chordomas were identified: 58 chordomas of the clivus, 48 chordomas of the mobile spine, and 45 sacrococcygeal chordomas. Patients diagnosed with cervical chordomas were the oldest among all groups (58 ± 14 years, p = 0.009). Most patients were male (n = 91, 60.3%) and White (n = 139, 92.1%). Most patients underwent resection with or without radiation therapy (n = 129, 85.4%). Subtotal resection followed by radiation therapy (n = 51, 33.8%) was the most common treatment modality, followed by gross-total resection followed by radiation therapy (n = 43, 28.5%). Multivariate analysis showed that S100 and pan-cytokeratin were associated with an increased risk of postoperative recurrence (OR 3.67, 95% CI 1.09–12.42, p = 0.03; and OR 3.74, 95% CI 0.05–2.21, p = 0.02, respectively). In the decision tree analysis, a clinical follow-up > 1897 days was found in 37% of encounters and carried a 90% chance of being classified as recurrence (accuracy = 77%). Random forest analysis (n = 500 trees) showed that patient age, type of surgical treatment, location of tumor, S100, pan-cytokeratin, and epithelial membrane antigen (EMA) were the factors predicting long-term recurrence.
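As a worked illustration of how an odds ratio, its confidence interval, and its p value relate, the sketch below back-calculates the log-scale standard error and a two-sided p value from the reported S100 estimate (OR 3.67, 95% CI 1.09–12.42). It assumes a Wald-type interval that is symmetric on the log-odds scale; the abstract does not state how the interval was computed, and the helper name `wald_summary` is hypothetical.

```python
import math


def wald_summary(odds_ratio, ci_low, ci_high, z_crit=1.96):
    """Back out SE, z, and a two-sided p value from an OR and its 95% Wald CI."""
    # On the log scale a Wald interval is log(OR) +/- z_crit * SE.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = math.log(odds_ratio) / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    # The geometric mean of the bounds should sit close to the point estimate.
    implied_or = math.sqrt(ci_low * ci_high)
    return {"se": se, "z": z, "p": p, "implied_or": implied_or}


s100 = wald_summary(3.67, 1.09, 12.42)
```

Under this assumption, the geometric mean of the S100 bounds is about 3.68 and the two-sided p value is between 0.03 and 0.04, in line with the reported OR of 3.67 and p = 0.03.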
CONCLUSIONS
Combined with tree-based ML tools, the IHC and clinicopathological variables identified recurrence patterns with an accuracy of 77%. S100, pan-cytokeratin, and EMA were the IHC drivers of recurrence. These findings illustrate the potential of ML algorithms for analyzing and predicting outcomes of rare conditions with small sample sizes.