Objective
Cholesteatoma and otitis media are two of the most common middle ear diseases. Because their treatment principles differ, distinguishing between them is clinically important. Both chronic suppurative otitis media (CSOM) and middle ear cholesteatoma (MEC) can appear on CT images as low-density, soft-tissue-like masses partially filling the middle ear and mastoid cavities. However, CT images of typical MEC may also show progressive destruction of auditory structures and adjacent cranial bones. Compared with high-resolution CT (HRCT), ultra-high-resolution CT (U-HRCT) offers inherent continuity and a more detailed display of the fine structures of the middle ear. This study proposes a "cloud-edge" collaborative training framework for middle ear disease classification that exploits temporal bone U-HRCT imaging data. By integrating the YOLO recognition algorithm, the framework aims to provide auxiliary classification of MEC and CSOM from U-HRCT images.
Design
In the cloud-edge collaborative framework, edge devices acquire U-HRCT imaging data and perform auxiliary classification of middle ear diseases using image recognition and inference techniques. The imaging data collected by the edge devices are transmitted to the cloud, where a unified model training process is executed; the resulting model containers are then deployed to the edge devices for subsequent auxiliary diagnosis. Mixup and Mosaic data augmentation were used to improve model robustness and generalization. Several object detection models from the You Only Look Once (YOLO) family were used, and the final model was selected based on their performance.
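As an illustration of the Mixup augmentation mentioned above, the following is a minimal sketch rather than the study's implementation. It assumes grayscale images stored as nested lists of floats and one-hot class labels; the function name `mixup` and the default `alpha=0.2` are hypothetical choices. Detection pipelines typically also merge the bounding boxes of both images, which is omitted here.

```python
import random


def mixup(image_a, image_b, label_a, label_b, alpha=0.2, rng=None):
    """Blend two equally sized images and their one-hot labels (Mixup).

    alpha is an assumed hyperparameter of the Beta(alpha, alpha)
    distribution that the mixing weight is drawn from.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]

    # Pixel-wise convex combination of the two images.
    mixed_image = [
        [lam * pa + (1.0 - lam) * pb for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(image_a, image_b)
    ]
    # The labels are blended with the same weight.
    mixed_label = [
        lam * ya + (1.0 - lam) * yb for ya, yb in zip(label_a, label_b)
    ]
    return mixed_image, mixed_label, lam
```

Because the image and the label share one mixing weight, the blended label always remains a valid probability vector over the two classes.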
Results
This study found that the cloud-edge collaborative framework can effectively classify temporal bone U-HRCT imaging data as MEC or CSOM. In testing, the framework collected real CT image data, processed the data, and trained models as designed. Multiple models were trained, and their detection ability was assessed with selected metrics, allowing model selection to trade off computation time against accuracy. The selected model was then deployed to the edge device, where it performed auxiliary classification tasks.
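One simple way to express the trade-off between computation time and accuracy is to pick the most accurate model that fits a latency budget. The sketch below is an assumption for illustration only: the `Candidate` type, the `select_model` function, and the latency-budget rule are hypothetical and do not describe the selection procedure or metrics used in the study.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float    # e.g. a validation metric in [0, 1]
    latency_ms: float  # mean inference time per image on the edge device


def select_model(
    candidates: Sequence[Candidate], max_latency_ms: float
) -> Optional[Candidate]:
    """Return the most accurate candidate within the latency budget.

    Ties on accuracy are broken in favor of the faster model.
    Returns None when no candidate meets the budget.
    """
    eligible = [c for c in candidates if c.latency_ms <= max_latency_ms]
    if not eligible:
        return None
    return max(eligible, key=lambda c: (c.accuracy, -c.latency_ms))
```

A tighter budget favors smaller models on constrained edge hardware, while a looser budget lets the most accurate model win.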
Conclusions
This study discussed the significance of temporal bone U-HRCT imaging in the diagnosis of CSOM and MEC and proposed a cloud-edge collaborative model training framework for auxiliary classification from U-HRCT imaging data. This approach makes fuller use of the available data, leverages the diversity of image recognition algorithms, and supports a high level of classification accuracy.