Transformer-based structuring of free-text radiology report databases

Nowak, S.; Biesner, David; Layer, Yannik C.; Theis, Mirko; Schneider, Helen; Block, Wolfgang; Wulff, Benjamin; Attenberger, Ulrike I.; Sifa, Rafet; Sprinkart, Alois M.

doi:10.1007/s00330-023-09526-y

Cited by 9 publications

(11 citation statements)

References 14 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Similar to Nowak et al [11], the deep learning-based model outperformed the rule-based model on German reports. Apart from using different data sets and labels, a direct comparison is not conclusive, as their model was trained differently, and they considered both uncertain and negative mentions as negative labels.…”

Section: Chestsupporting

confidence: 68%

“…For example, the mean mention extraction F1 score improved from 84 % to 94 % when using all data. Furthermore, our model was trained on approximately 1000 manually labeled reports, compared to the total of 14580 used for development by Nowak et al [11]. They showed that increasing the amount of manually annotated training data improved mean F1 scores from 70.9 % to 95.5 % when increasing training data from 500 to 14580 samples.…”

Section: Chestmentioning

confidence: 99%

“…Radiologists are in short supply worldwide [1,2,3,4], for example, due to an aging population [5], and deep learning models hold promise for addressing this shortage, for example, as part of clinical decision-support systems [6,7]. However, training such models often requires large data sets [8,9] that are expensive and time-consuming to manually label [10,11]. To reduce the amount of time for obtaining labeled data sets, automatic label extraction from radiology reports is a compelling option.…”

Section: Introductionmentioning

confidence: 99%

“…Smit et al improved upon their rule-based labeler for English radiology reports by using a BERT [15] language model as a backbone [20]. Similarly, Nowak et al investigated the use of BERT for German radiology reports [11]. They compared a rule-based labeler to a deep learning model, trained with 18000 manually annotated reports, rule-based extracted labels, and a combination of both.…”

Section: Introductionmentioning

confidence: 99%

See 3 more Smart Citations

Language model-based labeling of German thoracic radiology reports

Wollek,

Haitzer,

Sedlmeyr

et al. 2024

Rofo

View full text Add to dashboard Cite

The aim of this study was to explore the potential of weak supervision in a deep learning-based label prediction model. The goal was to use this model to extract labels from German free-text thoracic radiology reports on chest X-ray images and for training chest X-ray classification models.The proposed label extraction model for German thoracic radiology reports uses a German BERT encoder as a backbone and classifies a report based on the CheXpert labels. For investigating the efficient use of manually annotated data, the model was trained using manual annotations, weak rule-based labels, and both. Rule-based labels were extracted from 66071 retrospectively collected radiology reports from 2017–2021 (DS 0), and 1091 reports from 2020–2021 (DS 1) were manually labeled according to the CheXpert classes. Label extraction performance was evaluated with respect to mention extraction, negation detection, and uncertainty detection by measuring F1 scores. The influence of the label extraction method on chest X-ray classification was evaluated on a pneumothorax data set (DS 2) containing 6434 chest radiographs with associated reports and expert diagnoses of pneumothorax. For this, DenseNet-121 models trained on manual annotations, rule-based and deep learning-based label predictions, and publicly available data were compared.The proposed deep learning-based labeler (DL) performed on average considerably stronger than the rule-based labeler (RB) for all three tasks on DS 1 with F1 scores of 0.938 vs. 0.844 for mention extraction, 0.891 vs. 0.821 for negation detection, and 0.624 vs. 0.518 for uncertainty detection. Pre-training on DS 0 and fine-tuning on DS 1 performed better than only training on either DS 0 or DS 1. Chest X-ray pneumothorax classification results (DS 2) were highest when trained with DL labels with an area under the receiver operating curve (AUC) of 0.939 compared to RB labels with an AUC of 0.858. Training with manual labels performed slightly worse than training with DL labels with an AUC of 0.934. In contrast, training with a public data set resulted in an AUC of 0.720.Our results show that leveraging a rule-based report labeler for weak supervision leads to improved labeling performance. The pneumothorax classification results demonstrate that our proposed deep learning-based labeler can serve as a substitute for manual labeling requiring only 1000 manually annotated reports for training.Wollek A, Haitzer P, Sedlmeyr T et al. Language modelbased labeling of German thoracic radiology reports. Fortschr Röntgenstr 2024; DOI 10.1055/a-2287-5054

show abstract

Section: Chestsupporting

confidence: 68%

Section: Chestmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

Section: Introductionmentioning

confidence: 99%

See 2 more Smart Citations

Language model-based labeling of German thoracic radiology reports

Wollek,

Haitzer,

Sedlmeyr

et al. 2024

Rofo

View full text Add to dashboard Cite

show abstract

“…For German radiology reports, Nowak et al investigated different approaches for training a deep-learning based labeling model [20]. In contrast to the CheXpert labeler, their model predicted only six observations: pulmonary infiltrates, pleural effusion, pulmonary congestion, pneumothorax, regular position of the central venous catheter (CVC) and misplaced position of the CVC.…”

Section: Introductionmentioning

confidence: 99%

German CheXpert Chest X-ray Radiology Report Labeler

Wollek,

Hyska,

Sedlmeyr

et al. 2024

Rofo

View full text Add to dashboard Cite

Purpose The aim of this study was to develop an algorithm to automatically extract annotations from German thoracic radiology reports to train deep learning-based chest X-ray classification models. Materials and Methods An automatic label extraction model for German thoracic radiology reports was designed based on the CheXpert architecture. The algorithm can extract labels for twelve common chest pathologies, the presence of support devices, and “no finding”. For iterative improvements and to generate a ground truth, a web-based multi-reader annotation interface was created. With the proposed annotation interface, a radiologist annotated 1086 retrospectively collected radiology reports from 2020–2021 (data set 1). The effect of automatically extracted labels on chest radiograph classification performance was evaluated on an additional, in-house pneumothorax data set (data set 2), containing 6434 chest radiographs with corresponding reports, by comparing a DenseNet-121 model trained on extracted labels from the associated reports, image-based pneumothorax labels, and publicly available data, respectively. Results Comparing automated to manual labeling on data set 1: “mention extraction” class-wise F1 scores ranged from 0.8 to 0.995, the “negation detection” F1 scores from 0.624 to 0.981, and F1 scores for “uncertainty detection” from 0.353 to 0.725. Extracted pneumothorax labels on data set 2 had a sensitivity of 0.997 [95 % CI: 0.994, 0.999] and specificity of 0.991 [95 % CI: 0.988, 0.994]. The model trained on publicly available data achieved an area under the receiver operating curve (AUC) for pneumothorax classification of 0.728 [95 % CI: 0.694, 0.760], while the models trained on automatically extracted labels and on manual annotations achieved values of 0.858 [95 % CI: 0.832, 0.882] and 0.934 [95 % CI: 0.918, 0.949], respectively. Conclusion Automatic label extraction from German thoracic radiology reports is a promising substitute for manual labeling. By reducing the time required for data annotation, larger training data sets can be created, resulting in improved overall modeling performance. Our results demonstrated that a pneumothorax classifier trained on automatically extracted labels strongly outperformed the model trained on publicly available data, without the need for additional annotation time and performed competitively compared to manually labeled data. Key points:

show abstract

Große Sprachmodelle von OpenAI, Google, Meta, X und Co.

Nowak,

Sprinkart

2024

Radiologie

View full text Add to dashboard Cite

Transformer-based structuring of free-text radiology report databases

Cited by 9 publications

References 14 publications

Language model-based labeling of German thoracic radiology reports

Language model-based labeling of German thoracic radiology reports

German CheXpert Chest X-ray Radiology Report Labeler

Große Sprachmodelle von OpenAI, Google, Meta, X und Co.

Contact Info

Product

Resources

About