Deep learning supports the differentiation of alcoholic and other-than-alcoholic cirrhosis based on MRI

Luetkens, Julian A.; Nowak, S.; Mesropyan, Narine; Block, Wolfgang; Praktiknjo, Michael; Chang, Johannes; Bauckhage, Christian; Sifa, Rafet; Sprinkart, Alois M.; Faron, Anton; Attenberger, Ulrike

doi:10.1038/s41598-022-12410-2

Cited by 17 publications

(6 citation statements)

References 23 publications


“…When fine-tuning the models for text classification, we applied the following concepts. As proposed in previous studies, we fine-tuned all pre-trained models for text classification in two steps: First, frozen pre-trained language model parameters were used to adapt the new classification head and then all parameters were trained, but with layer-specific learning rates with maximum values increasing linearly from 10⁻⁹ to 10⁻⁶ from the first to the last layer [18–20]. Since the threshold for binarization of the predictions after sigmoid activation is not intrinsically set in multi-label classification, class-specific thresholds were determined by identifying the thresholds with the highest F1-scores on the training data [21].…”

Section: Methods (mentioning)

confidence: 99%
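The two ideas in the statement quoted above (per-layer maximum learning rates that rise linearly from 10⁻⁹ to 10⁻⁶, and class-specific thresholds chosen by F1-score on the training data) can be illustrated with a minimal sketch. This is not the citing authors' code. The function names, the threshold grid of 0.01 to 0.99, and the toy inputs are assumptions made for illustration only.

```python
from typing import List, Optional, Sequence


def layerwise_max_lrs(n_layers: int, lo: float = 1e-9, hi: float = 1e-6) -> List[float]:
    """Maximum learning rate per layer, rising linearly from lo (first layer) to hi (last)."""
    if n_layers < 1:
        raise ValueError("n_layers must be positive")
    if n_layers == 1:
        return [hi]
    step = (hi - lo) / (n_layers - 1)
    return [lo + step * i for i in range(n_layers)]


def f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Binary F1-score; returns 0.0 when there are no positives at all."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_thresholds(
    probs: Sequence[Sequence[float]],
    labels: Sequence[Sequence[int]],
    grid: Optional[Sequence[float]] = None,
) -> List[float]:
    """For each class, pick the threshold on the sigmoid outputs with the highest F1.

    probs and labels are lists of rows, one row per sample, one column per class.
    The grid of candidate thresholds is an assumption, not taken from the source.
    """
    if grid is None:
        grid = [i / 100 for i in range(1, 100)]
    thresholds = []
    for c in range(len(probs[0])):
        y_true = [row[c] for row in labels]
        scores = [row[c] for row in probs]
        best_t, best_score = 0.5, -1.0
        for t in grid:
            score = f1(y_true, [1 if s >= t else 0 for s in scores])
            if score > best_score:  # keep the first (lowest) threshold on ties
                best_t, best_score = t, score
        thresholds.append(best_t)
    return thresholds
```

In practice the per-layer values would be handed to an optimizer as separate parameter groups, and the thresholds would be computed once on the training data and then applied unchanged to the test data.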

Transformer-based structuring of free-text radiology report databases

et al. 2023


Objectives To provide insights for on-site development of transformer-based structuring of free-text report databases by investigating different labeling and pre-training strategies.

Methods A total of 93,368 German chest X-ray reports from 20,912 intensive care unit (ICU) patients were included. Two labeling strategies were investigated to tag six findings of the attending radiologist. First, a system based on human-defined rules was applied for annotation of all reports (termed “silver labels”). Second, 18,000 reports were manually annotated in 197 h (termed “gold labels”) of which 10% were used for testing. An on-site pre-trained model (Tmlm) using masked-language modeling (MLM) was compared to a public, medically pre-trained model (Tmed). Both models were fine-tuned on silver labels only, gold labels only, and first with silver and then gold labels (hybrid training) for text classification, using varying numbers (N: 500, 1000, 2000, 3500, 7000, 14,580) of gold labels. Macro-averaged F1-scores (MAF1) in percent were calculated with 95% confidence intervals (CI).

Results Tmlm,gold (95.5 [94.5–96.3]) showed significantly higher MAF1 than Tmed,silver (75.0 [73.4–76.5]) and Tmlm,silver (75.2 [73.6–76.7]), but not significantly higher MAF1 than Tmed,gold (94.7 [93.6–95.6]), Tmed,hybrid (94.9 [93.9–95.8]), and Tmlm,hybrid (95.2 [94.3–96.0]). When using 7000 or less gold-labeled reports, Tmlm,gold (N: 7000, 94.7 [93.5–95.7]) showed significantly higher MAF1 than Tmed,gold (N: 7000, 91.5 [90.0–92.8]). With at least 2000 gold-labeled reports, utilizing silver labels did not lead to significant improvement of Tmlm,hybrid (N: 2000, 91.8 [90.4–93.2]) over Tmlm,gold (N: 2000, 91.4 [89.9–92.8]).

Conclusions Custom pre-training of transformers and fine-tuning on manual annotations promises to be an efficient strategy to unlock report databases for data-driven medicine.

Key Points
• On-site development of natural language processing methods that retrospectively unlock free-text databases of radiology clinics for data-driven medicine is of great interest.
• For clinics seeking to develop methods on-site for retrospective structuring of a report database of a certain department, it remains unclear which of previously proposed strategies for labeling reports and pre-training models is the most appropriate in context of, e.g., available annotator time.
• Using a custom pre-trained transformer model, along with a little annotation effort, promises to be an efficient way to retrospectively structure radiological databases, even if not millions of reports are available for pre-training.


“…Luetkens et al used ResNet-50 and DenseNet-121 for the differentiation of alcoholic and other-than-alcoholic cirrhosis based on MRI. ResNet50 achieved the best results (ACC 0.75, AUC 0.82), however, the performance was not significantly higher compared to Densenet121 [39]. Remedios et al provided an ablation study to compare convolutional neural networks for detecting large-vessel occlusion on computed tomography angiography in 300 patients.…”

Section: Discussion (mentioning)

confidence: 99%

Two-Stage Deep Learning Model for Automated Segmentation and Classification of Splenomegaly

Meddeb, Kossen, Bressem et al. 2022, Cancers


Splenomegaly is a common cross-sectional imaging finding with a variety of differential diagnoses. This study aimed to evaluate whether a deep learning model could automatically segment the spleen and identify the cause of splenomegaly in patients with cirrhotic portal hypertension versus patients with lymphoma disease. This retrospective study included 149 patients with splenomegaly on computed tomography (CT) images (77 patients with cirrhotic portal hypertension, 72 patients with lymphoma) who underwent a CT scan between October 2020 and July 2021. The dataset was divided into a training (n = 99), a validation (n = 25) and a test cohort (n = 25). In the first stage, the spleen was automatically segmented using a modified U-Net architecture. In the second stage, the CT images were classified into two groups using a 3D DenseNet to discriminate between the causes of splenomegaly, first using the whole abdominal CT, and second using only the spleen segmentation mask. The classification performances were evaluated using the area under the receiver operating characteristic curve (AUC), accuracy (ACC), sensitivity (SEN), and specificity (SPE). Occlusion sensitivity maps were applied to the whole abdominal CT images, to illustrate which regions were important for the prediction. When trained on the whole abdominal CT volume, the DenseNet was able to differentiate between the lymphoma and liver cirrhosis in the test cohort with an AUC of 0.88 and an ACC of 0.88. When the model was trained on the spleen segmentation mask, the performance decreased (AUC = 0.81, ACC = 0.76). Our model was able to accurately segment splenomegaly and recognize the underlying cause. Training on whole abdomen scans outperformed training using the segmentation mask. Nonetheless, considering the performance, a broader and more general application to differentiate other causes for splenomegaly is also conceivable.
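The four evaluation measures named in this abstract (AUC, ACC, SEN, SPE) can be computed from binary labels and predicted scores as in the sketch below. It is a generic illustration, not the study's evaluation code. The choice of which class counts as positive, the 0.5 decision threshold, and the function names are assumptions.

```python
from typing import Dict, Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def binary_metrics(
    y_true: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Return AUC plus accuracy, sensitivity and specificity at the given threshold."""
    auc = roc_auc(y_true, scores)  # also guarantees both classes are present
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "auc": auc,
        "acc": (tp + tn) / len(y_true),
        "sen": tp / (tp + fn),  # true positive rate
        "spe": tn / (tn + fp),  # true negative rate
    }
```

AUC does not depend on the threshold, whereas ACC, SEN and SPE do, which is why a model can show similar AUC but different accuracy under different cut-offs.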


“…Binary cross entropy loss, AdamW optimizer, a one cycle learning rate schedule with a maximum learning rate of 0.01, a weight decay of 0.01, and a batch size of 128 was used for training [14]. While fine-tuning the MS/G model on gold labels after training with silver labels, the maximum learning rate was reduced by a factor of 10⁻¹ per dense block from the last to the first block, as commonly done when applying pre-trained weights [15, 16]. Detailed information on model architecture and training can be found in supplement S4.…”

Section: Methods (mentioning)

confidence: 99%
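The block-wise learning-rate reduction described in the statement above can be sketched as follows. The statement gives the maximum learning rate (0.01) and the factor (10⁻¹ per dense block) but not the number of dense blocks or whether the last block keeps the full rate. The sketch assumes four dense blocks and assumes the last block keeps 0.01; both are assumptions, as is the function name.

```python
from typing import List


def blockwise_max_lrs(max_lr: float = 0.01, n_blocks: int = 4, factor: float = 0.1) -> List[float]:
    """Maximum learning rate per dense block, ordered from the first to the last block.

    The last block keeps max_lr (assumption); each earlier block is scaled by
    `factor` once more, so earlier, more generic layers change more slowly.
    """
    if n_blocks < 1:
        raise ValueError("n_blocks must be positive")
    return [max_lr * factor ** (n_blocks - 1 - k) for k in range(n_blocks)]


if __name__ == "__main__":
    for index, lr in enumerate(blockwise_max_lrs(), start=1):
        print(f"dense block {index}: max lr = {lr:.0e}")
```

Under these assumptions the first block trains with a maximum rate three orders of magnitude below the last one, which keeps pre-trained low-level features largely intact during fine-tuning.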

Development of image-based decision support systems utilizing information extracted from radiological free-text report databases with text-based transformers

Nowak, Schneider, Layer et al. 2023, Eur Radiol

Self Cite


Objectives To investigate the potential and limitations of utilizing transformer-based report annotation for on-site development of image-based diagnostic decision support systems (DDSS).

Methods The study included 88,353 chest X-rays from 19,581 intensive care unit (ICU) patients. To label the presence of six typical findings in 17,041 images, the corresponding free-text reports of the attending radiologists were assessed by medical research assistants (“gold labels”). Automatically generated “silver” labels were extracted for all reports by transformer models trained on gold labels. To investigate the benefit of such silver labels, the image-based models were trained using three approaches: with gold labels only (MG), with silver labels first, then with gold labels (MS/G), and with silver and gold labels together (MS+G). To investigate the influence of invested annotation effort, the experiments were repeated with different numbers (N) of gold-annotated reports for training the transformer and image-based models and tested on 2099 gold-annotated images. Significant differences in macro-averaged area under the receiver operating characteristic curve (AUC) were assessed by non-overlapping 95% confidence intervals.

Results Utilizing transformer-based silver labels showed significantly higher macro-averaged AUC than training solely with gold labels (N = 1000: MG 67.8 [66.0–69.6], MS/G 77.9 [76.2–79.6]; N = 14,580: MG 74.5 [72.8–76.2], MS/G 80.9 [79.4–82.4]). Training with silver and gold labels together was beneficial using only 500 gold labels (MS+G 76.4 [74.7–78.0], MS/G 75.3 [73.5–77.0]).

Conclusions Transformer-based annotation has potential for unlocking free-text report databases for the development of image-based DDSS. However, on-site development of image-based DDSS could benefit from more sophisticated annotation pipelines including further information than a single radiological report.

Clinical relevance statement Leveraging clinical databases for on-site development of artificial intelligence (AI)–based diagnostic decision support systems by text-based transformers could promote the application of AI in clinical practice by circumventing highly regulated data exchanges with third parties.

Key Points
• The amount of data from a database that can be used to develop AI-assisted diagnostic decision systems is often limited by the need for time-consuming identification of pathologies by radiologists.
• The transformer-based structuring of free-text radiological reports shows potential to unlock corresponding image databases for on-site development of image-based diagnostic decision support systems.
• However, the quality of image annotations generated solely on the content of a single radiology report may be limited by potential inaccuracies and incompleteness of this report.
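The significance criterion stated in this abstract, non-overlapping 95% confidence intervals, reduces to a simple interval comparison. The sketch below applies it to the interval bounds reported in the abstract for N = 1000 and N = 14,580. The function and variable names are assumptions, and the check reproduces only the decision rule, not how the intervals themselves were estimated.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (lower bound, upper bound)


def intervals_overlap(a: Interval, b: Interval) -> bool:
    """True if the two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def significantly_different(a: Interval, b: Interval) -> bool:
    """Decision rule from the abstract: significant if the 95% CIs do not overlap."""
    return not intervals_overlap(a, b)


if __name__ == "__main__":
    # 95% CI bounds of the macro-averaged AUC as reported in the abstract.
    comparisons = {
        "N = 1000": ((66.0, 69.6), (76.2, 79.6)),    # MG vs MS/G
        "N = 14,580": ((72.8, 76.2), (79.4, 82.4)),  # MG vs MS/G
    }
    for name, (mg, msg) in comparisons.items():
        print(name, "significant:", significantly_different(mg, msg))
```

Non-overlap of two 95% intervals is a conservative criterion: intervals that overlap slightly can still correspond to a difference that a direct test would flag as significant.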
