Purpose
To explore the application of deep learning (DL) methods based on sagittal T2-weighted MR images for discriminating between spinal tuberculosis (STB) and spinal metastases (SM).
Patients and Methods
A total of 121 patients with histologically confirmed STB or SM from four institutions were retrospectively analyzed. Data from two institutions were used for model development and internal validation, while data from the remaining two institutions were used for external testing. Using MViTv2, EfficientNet-B3, ResNet101, and ResNet34 as backbone networks, we developed four distinct DL models and evaluated their diagnostic performance with metrics including accuracy (ACC), area under the receiver operating characteristic curve (AUC), F1 score, and the confusion matrix. In addition, the external test images were evaluated in a blinded fashion by two spine surgeons with different levels of experience. We also used Gradient-weighted Class Activation Mapping (Grad-CAM) to visualize the high-dimensional features learned by the different DL models.
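The Python sketch below illustrates, under stated assumptions, how such a pipeline could be assembled: a pretrained backbone from the timm library is adapted to the two-class STB-vs-SM task, and ACC, F1 score, AUC, and the confusion matrix are computed on a held-out data loader. The model name, optimizer, learning rate, and loader are illustrative placeholders rather than the authors' actual configuration, and the Grad-CAM step is omitted for brevity.

# Illustrative sketch only: fine-tune a pretrained backbone for binary
# STB-vs-SM classification and compute the metrics reported in the study.
import timm
import torch
import torch.nn as nn
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score, confusion_matrix

# Any of the four backbones can be created the same way via timm, e.g.
# "mvitv2_small", "efficientnet_b3", "resnet101", or "resnet34" (assumed names).
model = timm.create_model("mvitv2_small", pretrained=True, num_classes=2)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)  # assumed hyperparameters

def evaluate(model, loader, device="cuda"):
    """Return ACC, F1, AUC, and the confusion matrix for a validation/test loader."""
    model.eval().to(device)
    labels, preds, probs = [], [], []
    with torch.no_grad():
        for images, targets in loader:  # loader yields (sagittal T2 image batch, label batch)
            logits = model(images.to(device))
            prob_sm = torch.softmax(logits, dim=1)[:, 1]  # probability of the "metastasis" class
            probs.extend(prob_sm.cpu().tolist())
            preds.extend(logits.argmax(dim=1).cpu().tolist())
            labels.extend(targets.tolist())
    return {
        "acc": accuracy_score(labels, preds),
        "f1": f1_score(labels, preds),
        "auc": roc_auc_score(labels, probs),
        "confusion_matrix": confusion_matrix(labels, preds),
    }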
Results
For the internal validation set, MViTv2 outperformed the other models with an accuracy of 98.7%, an F1 score of 98.6%, and an AUC of 0.98. The other models followed in this order: EfficientNet-B3 (ACC: 96.1%, F1 score: 95.9%, AUC: 0.99), ResNet101 (ACC: 85.5%, F1 score: 84.8%, AUC: 0.90), and ResNet34 (ACC: 81.6%, F1 score: 80.7%, AUC: 0.85). For the external test set, MViTv2 again performed best, with an accuracy of 91.9%, an F1 score of 91.5%, and an AUC of 0.95. EfficientNet-B3 came second (ACC: 85.9%, F1 score: 91.5%, AUC: 0.91), followed by ResNet101 (ACC: 80.8%, F1 score: 80.0%, AUC: 0.87) and ResNet34 (ACC: 78.8%, F1 score: 77.9%, AUC: 0.86). Additionally, the diagnostic accuracy of the less experienced spine surgeon was 73.7%, whereas that of the more experienced surgeon was 88.9%.
Conclusion
Deep learning based on sagittal T2-weighted images can help discriminate between STB and SM and can achieve diagnostic performance comparable to that of experienced spine surgeons.