Learning a Loss Function for Segmentation: A Feasibility Study

Moltz, Jan Hendrik; Hänsch, Annika; Lassen-Schmidt, Bianca; Haas, Benjamin de; Genghi, Angelo; Schreier, Jan; Morgaś, Tomasz; Klein, Jan

doi:10.1109/isbi45749.2020.9098557

Cited by 10 publications

(7 citation statements)

References 2 publications

“…The training set was divided into the training and validation datasets using a ratio of approximately 90%:10% (1175:123). We used a 3D U-Net architecture based on the Dice loss function for detecting a relatively small BM compared to the brain (35)(36)(37)(38). A 3D structure is advantageous over a 2D structure for recognizing the edges of BM.…”

Section: Development of the CAD Software (mentioning)

confidence: 99%

Deep Learning-Based Computer-Aided Detection System for Automated Treatment Response Assessment of Brain Metastases on 3D MRI

et al. 2021

Background: Although accurate treatment response assessment for brain metastases (BMs) is crucial, it is highly labor intensive. This retrospective study aimed to develop a computer-aided detection (CAD) system for automated BM detection and treatment response evaluation using deep learning. Methods: We included 214 consecutive MRI examinations of 147 patients with BM obtained between January 2015 and August 2016. These were divided into the training (174 MR images from 127 patients) and test datasets according to temporal separation (temporal test set #1; 40 MR images from 20 patients). For external validation, 24 patients with BM and 11 patients without BM from other institutions were included (geographic test set). In addition, we included 12 MRIs from BM patients obtained between August 2017 and March 2020 (temporal test set #2). Detection sensitivity, Dice similarity coefficient (DSC) for segmentation, and agreements in one-dimensional and volumetric Response Assessment in Neuro-Oncology Brain Metastases (RANO-BM) criteria between CAD and radiologists were assessed. Results: In the temporal test set #1, the sensitivity was 75.1% (95% confidence interval [CI]: 69.6%, 79.9%), mean DSC was 0.69 ± 0.22, and false-positive (FP) rate per scan was 0.8 for BM ≥ 5 mm. Agreements in the RANO-BM criteria were moderate (κ, 0.52) and substantial (κ, 0.68) for one-dimensional and volumetric, respectively. In the geographic test set, sensitivity was 87.7% (95% CI: 77.2%, 94.5%), mean DSC was 0.68 ± 0.20, and FP rate per scan was 1.9 for BM ≥ 5 mm. In the temporal test set #2, sensitivity was 94.7% (95% CI: 74.0%, 99.9%), mean DSC was 0.82 ± 0.20, and FP per scan was 0.5 (6/12) for BM ≥ 5 mm. Conclusions: Our CAD showed potential for automated treatment response assessment of BM ≥ 5 mm.
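
The citation statement above mentions training a 3D U-Net with a Dice loss, and the abstract reports the Dice similarity coefficient (DSC). As a rough illustration of how such a loss is commonly formulated, here is a minimal pure-Python sketch of a soft Dice loss on flattened voxel values. It is not the cited authors' implementation; the function name `soft_dice_loss` and the smoothing term `eps` are assumptions made for this example.

```python
def soft_dice_loss(probs, targets, eps=1e-6):
    """Soft Dice loss for flattened predictions in [0, 1] and binary targets.

    Hypothetical helper for illustration only; `eps` is an assumed smoothing
    term that avoids division by zero when both inputs are empty.
    """
    if len(probs) != len(targets):
        raise ValueError("probs and targets must have the same length")
    # Soft overlap: sum of element-wise products of prediction and label.
    intersection = sum(p * t for p, t in zip(probs, targets))
    # Sum of predicted foreground mass and labeled foreground mass.
    denominator = sum(probs) + sum(targets)
    dice = (2.0 * intersection + eps) / (denominator + eps)
    return 1.0 - dice


# Usage: a perfect prediction gives a loss near 0, a disjoint one near 1.
perfect = soft_dice_loss([1.0, 0.0, 1.0, 0.0], [1, 0, 1, 0])
disjoint = soft_dice_loss([1.0, 0.0], [0, 1])
```

Because the loss is normalized by the total foreground mass rather than the number of voxels, it is often chosen when the target structure is small relative to the image, which matches the motivation given in the statement.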

“…For each patient, the geometric agreement with the GS contours was assessed with multiple measures for the MDA+RO, and DL+RO contours, as well as the unrevised contours from the autosegmentation model (which will be referred to as the DL arm) with multiple measures: volumetric Dice similarity coefficient (64) (VDSC), surface Dice similarity coefficient (SDCS, with τ=1, 1.5, 2, and 3 mm) (19), 95-percentile Hausdorff distance (64) (HD95%), added path length (APL, computed with tolerances of 1, 2, 3, and 5 mm) (65), precision (64), sensitivity (64), contour Dice coefficient (CDC) (66), and the change in volume and centroid of structure.…”

Section: Methods (mentioning)

confidence: 99%

Validation of clinical acceptability of deep-learning-based automated segmentation of organs-at-risk for head-and-neck radiotherapy treatment planning

et al. 2023

Introduction: Organ-at-risk segmentation for head and neck cancer radiation therapy is a complex and time-consuming process (requiring up to 42 individual structures) and may delay the start of treatment or even limit access to function-preserving care. The feasibility of using a deep learning (DL) based autosegmentation model to reduce contouring time without compromising contour accuracy is assessed through a blinded randomized trial of radiation oncologists (ROs) using retrospective, de-identified patient data. Methods: Two head and neck expert ROs used dedicated time to create gold standard (GS) contours on computed tomography (CT) images. 445 CTs were used to train a custom 3D U-Net DL model covering 42 organs-at-risk, and an additional 20 CTs were held out for the randomized trial. For each held-out patient dataset, one of the eight participant ROs was randomly allocated to review and revise the contours produced by the DL model, while another reviewed contours produced by a medical dosimetry assistant (MDA), both blinded to their origin. Time required for MDAs and ROs to contour was recorded, and the unrevised DL contours, as well as the RO-revised contours by the MDAs and DL model, were compared to the GS for that patient. Results: Mean time for initial MDA contouring was 2.3 hours (range 1.6-3.8 hours) and RO-revision took 1.1 hours (range 0.4-4.4 hours), compared to 0.7 hours (range 0.1-2.0 hours) for the RO-revisions to DL contours. Total time reduced by 76% (95%-Confidence Interval: 65%-88%) and RO-revision time reduced by 35% (95%-CI, -39%-91%). For all geometric and dosimetric metrics computed, agreement with GS was equivalent or significantly greater (p<0.05) for RO-revised DL contours compared to the RO-revised MDA contours, including volumetric Dice similarity coefficient (VDSC), surface DSC, added path length, and the 95%-Hausdorff distance. 32 OARs (76%) had mean VDSC greater than 0.8 for the RO-revised DL contours, compared to 20 (48%) for RO-revised MDA contours, and 34 (81%) for the unrevised DL OARs. Conclusion: DL autosegmentation demonstrated significant time-savings for organ-at-risk contouring while improving agreement with the institutional GS, indicating comparable accuracy of the DL model. Integration into the clinical practice with a prospective evaluation is currently underway.

“…We used the following well-known and common metrics to evaluate the similarity of two segmentations (manually or automatically generated): Dice score, mean surface distance, and Hausdorff distance. In addition, the contour Dice score 29 was used which measures the fraction of the axial contours that lie within a predefined tolerance (here 1, 3, 5, 7, and 10 mm) of the reference contour. The metric is a contour-based version of the surface Dice score, 19 as corrections in RT planning are typically based on contours, not on surfaces.…”

Section: Methods (mentioning)

confidence: 99%

Hippocampus segmentation in CT using deep learning: impact of MR versus CT-based training contours

et al. 2020

Purpose: Hippocampus contouring for radiotherapy planning is performed on MR image data due to poor anatomical visibility on computed tomography (CT) data. Deep learning methods for direct CT hippocampus auto-segmentation exist, but use MR-based training contours. We investigate if these can be replaced by CT-based contours without loss in segmentation performance. This would remove the MR not only from inference but also from training. Approach: The hippocampus was contoured by medical experts on MR and CT data of 45 patients. Convolutional neural networks (CNNs) for hippocampus segmentation on CT were trained on CT-based or propagated MR-based contours. In both cases, their predictions were evaluated against the MR-based contours considered as the ground truth. Performance was measured using several metrics, including Dice score, surface distances, and contour Dice score. Bayesian dropout was used to estimate model uncertainty. Results: CNNs trained on propagated MR contours (median Dice 0.67) significantly outperform those trained on CT contours (0.59) and also experts contouring manually on CT (0.59). Differences between the latter two are not significant. Training on MR contours results in lower model uncertainty than training on CT contours. All contouring methods (manual or CNN) on CT perform significantly worse than a CNN segmenting the hippocampus directly on MR (median Dice 0.76). Additional data augmentation by rigid transformations improves the quantitative results but the difference remains significant. Conclusions: CT-based training contours for CT hippocampus segmentation cannot replace propagated MR-based contours without significant loss in performance. However, if MR-based contours are used, the resulting segmentations outperform experts in contouring the hippocampus on CT.
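
The statement and abstract for this entry name the Dice score and the Hausdorff distance as measures of similarity between two segmentations. The following pure-Python sketch shows the textbook definitions of both on small sets of voxel coordinates. It is a simplified illustration under stated assumptions, not the evaluation code of the cited work: the names `dice_score` and `hausdorff_distance` are hypothetical, segmentations are represented as collections of coordinate tuples, and distances are measured between all voxels rather than extracted surfaces or contours.

```python
import math


def dice_score(a, b):
    """Dice score 2|A ∩ B| / (|A| + |B|) for two collections of voxel coordinates."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # Assumed convention: two empty segmentations agree perfectly.
    return 2.0 * len(a & b) / (len(a) + len(b))


def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets (brute force)."""
    if not a or not b:
        raise ValueError("both point sets must be non-empty")

    def directed(src, dst):
        # Largest distance from any point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(a, b), directed(b, a))


# Usage with two small 2D masks that share two of their three voxels.
reference = {(0, 0), (0, 1), (1, 0)}
prediction = {(0, 0), (0, 1), (1, 1)}
overlap = dice_score(reference, prediction)
max_error = hausdorff_distance(reference, prediction)
```

The brute-force distance search is quadratic in the number of points, so it is only suitable for toy inputs like this one; practical evaluations typically rely on distance transforms or spatial indexing.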