Objectives
This study aimed to clarify the performance of magnetic resonance imaging (MRI)-based deep learning classification models in diagnosing temporomandibular joint osteoarthritis (TMJ-OA) and to compare the diagnostic performance of the developed models with that of human observers.
Methods
The subjects were 118 patients who underwent MRI for examination of TMJ disorders. One hundred condyles with TMJ-OA and 100 condyles without TMJ-OA were enrolled. Deep learning was performed with four networks (ResNet18, EfficientNet b4, Inception v3, and GoogLeNet) using five-fold cross-validation. Receiver operating characteristic (ROC) curves were drawn for each model, and diagnostic metrics were determined. The performances of the four network models were compared using Kruskal-Wallis tests and post-hoc Scheffe tests, and the ROC curves of the best model and the human observers were compared using chi-square tests, with p < 0.05 considered significant.
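As an illustration of the evaluation design described above, the following minimal sketch splits 100 positive and 100 negative condyles into five stratified folds and computes an AUC per fold. It is not the authors' pipeline: the function names `stratified_folds` and `auc` are hypothetical, the scores are randomly generated stand-ins for network outputs rather than study data, and stratifying the folds by class is an assumption that the abstract does not state.

```python
import random

def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds, keeping the class ratio in each fold."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across the folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds

def auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# 100 condyles with TMJ-OA (1) and 100 without (0), matching the study design.
labels = [1] * 100 + [0] * 100
folds = stratified_folds(labels)

# Hypothetical classifier scores (assumption): random values, not study data.
rng = random.Random(1)
scores = [rng.gauss(1.0 if y == 1 else 0.0, 1.0) for y in labels]

# Evaluate each held-out fold separately, as in five-fold cross-validation.
for k, test_idx in enumerate(folds):
    y = [labels[i] for i in test_idx]
    s = [scores[i] for i in test_idx]
    print(f"fold {k}: n={len(test_idx)}, AUC={auc(y, s):.2f}")
```

In a real experiment the scores for each fold would come from a network trained on the other four folds, and the per-fold AUCs would give a range such as the one reported for each network.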
Results
ResNet18 achieved the highest areas under the curve (AUCs), 0.91–0.93, and the highest accuracy, 0.85–0.88, among the four networks. There were significant differences in AUC and accuracy between ResNet and GoogLeNet (p = 0.0264 and p = 0.0418, respectively). The kappa values of the models were high: 0.95 for ResNet and 0.93 for EfficientNet. The experts achieved AUC and accuracy values similar to those of ResNet (0.94 and 0.85, and 0.84 and 0.84, respectively), but with a lower kappa of 0.67. The dental residents showed lower values. There were significant differences in AUCs between ResNet and residents (p < 0.0001) and between experts and residents (p < 0.0001).
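The kappa values above measure agreement beyond chance between two sets of ratings. The abstract does not say which kappa statistic was used or which readings were paired, so the sketch below assumes Cohen's kappa on two lists of binary diagnoses; the function name `cohen_kappa` and the example ratings are hypothetical and are not study data.

```python
def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length lists of categorical ratings."""
    n = len(a)
    # Observed agreement: fraction of cases where both ratings match.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of the marginal rates, summed over categories.
    cats = set(a) | set(b)
    expected = sum((a.count(c) / n) * (b.count(c) / n) for c in cats)
    return (observed - expected) / (1 - expected)

# Hypothetical ratings (assumption): 1 = TMJ-OA, 0 = no TMJ-OA.
first = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
second = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
print(round(cohen_kappa(first, second), 2))
```

Here the two readings agree on 8 of 10 cases, while chance alone would produce agreement on half of them, so kappa is (0.8 - 0.5) / (1 - 0.5) = 0.6.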
Conclusions
A deep learning model achieved high performance in the MRI-based diagnosis of TMJ-OA.