To evaluate the performance of a deep learning-based algorithm for automatic detection and labeling of rib fractures from multicenter chest CT images. Materials and Methods: This retrospective study included 10 943 patients (mean age, 55 years; 6418 men) from six hospitals (January 1, 2017 to December 30, 2019), which consisted of patients with and without rib fractures who underwent CT. The patients were separated into one training set (n = 2425), two lesion-level test sets (n = 362 and 105), and one examination-level test set (n = 8051).Free-response receiver operating characteristic (FROC) score (mean sensitivity of seven different false-positive rates), precision, sensitivity, and F1 score were used as metrics to assess rib fracture detection performance. Area under the receiver operating characteristic curve (AUC), sensitivity, and specificity were employed to evaluate the classification accuracy. The mean Dice coefficient and accuracy were used to assess the performance of rib labeling.
Results:In the detection of rib fractures, the model showed an FROC score of 84.3% on test set 1.