Background
It is difficult for radiologists, especially junior radiologists with limited experience, to differentiate mediastinal lymphomas from thymic epithelial tumors (TETs) because of their overlapping imaging features. The purpose of this study was to develop and validate a CT-based clinico-radiomics model for differentiating lymphomas from TETs and to investigate whether a human-machine hybrid system can help junior radiologists improve their diagnostic performance.
Methods
Patients who underwent contrast-enhanced chest CT and had pathologically confirmed lymphoma or TET at two centers, from January 2011 to December 2019 and from January 2017 to December 2021, were retrospectively included and assigned to the training/validation set and the external test set, respectively. Clinical and radiomic signatures were pre-selected by elastic net, and the models were built from the selected signatures using ensemble learning. Three radiologists independently reviewed the CT images and assessed each case in the external test set with knowledge of the relevant clinical information. The diagnoses of reader 1, reader 2, and reader 3 were compared with those of the models in the external test set and were then separately input into the model’s ensemble process, forming a human-machine system that made the final decisions in the external test set. The improvement in the radiologists' diagnostic performance with the human-machine system was evaluated by the area under the receiver operating characteristic curve (AUC) and the increase rate.
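The comparison between a reader alone and the human-machine system can be illustrated with a minimal sketch. It assumes a simple weighted average of the model's predicted probability and the reader's binary call as the fusion rule; the abstract does not specify how reader diagnoses enter the ensemble, so `hybrid_scores`, `reader_weight`, and the toy data are hypothetical and are not the study's actual method or data.

```python
from typing import List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def hybrid_scores(
    model_probs: Sequence[float],
    reader_calls: Sequence[int],
    reader_weight: float = 0.5,
) -> List[float]:
    """Hypothetical fusion rule (an assumption, not the paper's ensemble):
    weighted average of the model probability and the reader's 0/1 call."""
    return [
        (1.0 - reader_weight) * p + reader_weight * r
        for p, r in zip(model_probs, reader_calls)
    ]


# Toy data only: 1 = lymphoma, 0 = TET.
labels = [1, 1, 1, 0, 0, 0]
model_probs = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2]
reader_calls = [1, 0, 1, 0, 1, 0]

auc_model = auc(labels, model_probs)
auc_reader = auc(labels, reader_calls)
auc_hybrid = auc(labels, hybrid_scores(model_probs, reader_calls))
print(f"model={auc_model:.3f} reader={auc_reader:.3f} hybrid={auc_hybrid:.3f}")
```

A binary reader call yields a two-level score, so the reader's AUC here reduces to the average of sensitivity and specificity; the fused score is continuous and is evaluated with the same function.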
Results
A total of 95 patients (51 with lymphomas and 44 with TETs) at Center 1 and 94 patients (52 with lymphomas and 42 with TETs) at Center 2 were enrolled and assigned to the training/validation set and the external test set, respectively. In the external test set, the clinico-radiomics model (AUC: 0.85 (0.76, 0.92)) outperformed the junior radiologists and the senior radiologist (reader 2: 0.70 (0.60, 0.80); reader 3: 0.60 (0.49, 0.71); reader 1: 0.76 (0.66, 0.86), respectively). The human-machine hybrid system demonstrated significant increases in AUC compared with human performance alone (reader 1 + model: 0.87 (0.79, 0.94), an increase of 14%; reader 2 + model: 0.86 (0.77, 0.93), an increase of 23%; reader 3 + model: 0.84 (0.76, 0.91), an increase of 40%).
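The reported percentage increases are consistent with a relative change in AUC, (AUC with model - AUC alone) / AUC alone. The short check below assumes that definition, which the abstract does not state explicitly; `increase_rate` is an illustrative helper name, and the inputs are the AUC point estimates quoted above.

```python
def increase_rate(auc_alone: float, auc_hybrid: float) -> float:
    """Relative AUC gain of the hybrid system over the reader alone
    (assumed definition of the 'increase rate')."""
    return (auc_hybrid - auc_alone) / auc_alone


# (reader alone, reader + model) AUC point estimates from the Results.
reported = {
    "reader 1": (0.76, 0.87),
    "reader 2": (0.70, 0.86),
    "reader 3": (0.60, 0.84),
}

# Express each gain as a whole-number percentage.
rates = {
    name: round(100 * increase_rate(alone, hybrid))
    for name, (alone, hybrid) in reported.items()
}
for name, pct in rates.items():
    print(f"{name}: +{pct}%")
```

Under this assumption the rounded values match the 14%, 23%, and 40% increases reported for readers 1, 2, and 3.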
Conclusions
The clinico-radiomics model outperformed the three radiologists in differentiating lymphomas from TETs on CT. The human-machine hybrid system significantly improved the performance of radiologists, especially junior radiologists. The system provides a real-time decision tool to reduce bias and mistakes in radiologists' diagnoses and to enhance the diagnostic confidence of junior radiologists. This work may encourage the exploration of human-machine hybrid systems for diagnosing other diseases and help drive future clinical applications.