Background
For knee osteoarthritis (KOA), grading with the commonly used radiographic severity criteria, the Kellgren–Lawrence (K-L) scale, varies among surgeons. Most existing diagnostic models require preprocessed radiographs and specific equipment.
Methods
All enrolled patients had been diagnosed with KOA, met the inclusion criteria, and were recruited from **** Hospital. The study included 2,579 images captured from posterior–anterior X-rays of 2,378 patients. We used RefineDet to train and validate the deep learning-based diagnostic model. After the model was developed, 823 images from 697 patients were enrolled as the test set. The whole test set was assessed by up to 5 surgeons and by the diagnostic model. To evaluate the model’s performance, we compared its results with the surgeons’ KOA severity diagnoses based on the K-L scale.
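The comparison between model and surgeon grades can be sketched as follows. This is a minimal illustration, not the study's evaluation code. It assumes the two sets of grades are tallied in a confusion matrix (rows: surgeon grade, columns: model grade) and derives overall accuracy plus one-vs-rest sensitivity, specificity, precision and F1-score for each grade. The function name `per_grade_metrics` and the toy 3-grade matrix are hypothetical and do not reproduce study data.

```python
def per_grade_metrics(confusion):
    """Return (accuracy, per-grade metrics) from a square confusion matrix.

    confusion[i][j] = number of images graded i by surgeons and j by the model.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n))
    accuracy = correct / total if total else 0.0

    metrics = []
    for k in range(n):
        tp = confusion[k][k]                                # both say grade k
        fn = sum(confusion[k]) - tp                         # surgeon k, model other
        fp = sum(confusion[i][k] for i in range(n)) - tp    # model k, surgeon other
        tn = total - tp - fn - fp
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        metrics.append({
            "grade": k,
            "sensitivity": sensitivity,
            "specificity": specificity,
            "precision": precision,
            "f1": f1,
        })
    return accuracy, metrics


# Hypothetical toy data (3 grades only), NOT taken from the study.
toy = [
    [1, 0, 0],
    [1, 8, 1],
    [0, 1, 8],
]
accuracy, metrics = per_grade_metrics(toy)
print(f"accuracy = {accuracy:.3f}")
for m in metrics:
    print(m)
```

In the toy matrix, grade 0 has a single true case that the model finds (sensitivity 1.0) and one false positive, so its precision is 0.5 and its F1-score is 2 × 0.5 × 1.0 / 1.5 ≈ 0.667. This shows how a rare grade can combine perfect recall with low precision.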
Results
Compared with the surgeons’ diagnoses, the model achieved an overall accuracy of 0.977. Its sensitivity (recall) for K-L 0 to 4 was 1.0, 0.972, 0.979, 0.983 and 0.989, respectively, and the corresponding specificity was 0.992, 0.997, 0.994, 0.991 and 0.995. The precision and F1-score were 0.5 and 0.667 for K-L 0, 0.914 and 0.930 for K-L 1, 0.978 and 0.971 for K-L 2, 0.981 and 0.974 for K-L 3, and 0.988 and 0.985 for K-L 4, respectively. The AUC exceeded 0.90 for every K-L grade. The quadratic weighted Kappa coefficient between the diagnostic model and the surgeons was 0.815 (P < 0.01, 95% CI 0.727–0.903). The model’s performance is comparable to the clinical diagnosis of KOA by surgeons. The model improved diagnostic efficiency and avoided cumbersome image preprocessing.
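For readers unfamiliar with the quadratic weighted Kappa coefficient, the sketch below shows the standard definition: disagreements are penalized by the squared distance between grades, and the observed penalty is compared with the penalty expected by chance from the marginal totals. It is an illustration under stated assumptions, not the study's code. The function name `quadratic_weighted_kappa` and the toy matrix are hypothetical, and the confidence interval and P value reported above are not computed here.

```python
def quadratic_weighted_kappa(confusion):
    """Quadratic weighted kappa from a square confusion matrix.

    confusion[i][j] = number of images graded i by one rater and j by the other.
    """
    n = len(confusion)
    if n < 2:
        raise ValueError("need at least two grades")
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n)) for j in range(n)]

    observed = 0.0  # weighted disagreement actually observed
    expected = 0.0  # weighted disagreement expected by chance
    for i in range(n):
        for j in range(n):
            weight = (i - j) ** 2 / (n - 1) ** 2  # 0 on the diagonal, 1 at maximum distance
            observed += weight * confusion[i][j]
            expected += weight * row_totals[i] * col_totals[j] / total

    if expected == 0:
        # All ratings fall in one grade for both raters; kappa is undefined.
        raise ValueError("kappa is undefined when all ratings share one grade")
    return 1.0 - observed / expected


# Hypothetical toy data (3 grades only), NOT taken from the study.
toy = [
    [1, 0, 0],
    [1, 8, 1],
    [0, 1, 8],
]
print(f"quadratic weighted kappa = {quadratic_weighted_kappa(toy):.3f}")
```

Because the weights grow with the square of the grade difference, confusing K-L 0 with K-L 4 lowers the coefficient far more than confusing two adjacent grades, which suits an ordinal scale such as K-L.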
Conclusion
The deep learning-based diagnostic model can be used on portable devices to assess the severity of KOA according to the Kellgren–Lawrence scale. It improves diagnostic efficiency while producing highly reliable and reproducible results.