Objective: Automated medical image segmentation (MIS) using deep learning has traditionally relied on models built and trained from scratch, or at least fine-tuned on a target dataset. The Segment Anything Model (SAM) by Meta challenges this paradigm by providing zero-shot generalisation capabilities. This study aims to develop and compare methods for refining traditional U-Net segmentations by repurposing them for automated SAM prompting.

Approach: A U-Net with an EfficientNet-B4 encoder was trained using four-fold cross-validation on an in-house brain metastases dataset. Predictions from each validation set were used for automatic sparse prompt generation via a bounding box prompting method (BBPM) and novel implementations of the point prompting method (PPM). The PPMs frequently produced poor slice predictions (PSPs) that required identification and substitution. Slices were identified as PSPs when they (1) contained multiple predicted regions per lesion or (2) had outlier foreground pixel counts relative to the patient's other slices. PSPs were substituted with either the initial U-Net predictions or the SAM BBPM predictions. Methods were evaluated and compared using the patients' mean volumetric Dice similarity coefficient (DSC).
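
As an illustration of the Approach, the following minimal sketch shows how sparse prompts might be derived from a binary U-Net slice prediction and how the two PSP criteria could be checked. It is not the study's implementation: the function names, the 4-connectivity choice, the centroid-nearest point prompt, and the median-absolute-deviation outlier rule with threshold `k` are all assumptions made for the example, and plain nested lists are used only to keep it dependency-free.

```python
from collections import deque
from statistics import median


def connected_regions(mask):
    """Split a 2D binary mask into regions (lists of (row, col)), 4-connectivity."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, region = deque([(r, c)]), []
                while queue:  # breadth-first flood fill
                    y, x = queue.popleft()
                    region.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                regions.append(region)
    return regions


def box_prompt(region):
    """Tight bounding box (x_min, y_min, x_max, y_max) around one region."""
    ys = [y for y, _ in region]
    xs = [x for _, x in region]
    return (min(xs), min(ys), max(xs), max(ys))


def point_prompt(region):
    """Assumed point prompt: the region pixel nearest its centroid, as (x, y)."""
    cy = sum(y for y, _ in region) / len(region)
    cx = sum(x for _, x in region) / len(region)
    y, x = min(region, key=lambda p: (p[0] - cy) ** 2 + (p[1] - cx) ** 2)
    return (x, y)


def is_fragmented(lesion_mask):
    """PSP criterion 1: a single lesion's predicted mask has more than one region."""
    return len(connected_regions(lesion_mask)) > 1


def outlier_slices(pixel_counts, k=3.0):
    """PSP criterion 2 (assumed rule): flag slices whose foreground pixel count
    deviates from the patient's median by more than k * MAD (MAD floored at 1)."""
    med = median(pixel_counts)
    mad = median(abs(c - med) for c in pixel_counts)
    return [abs(c - med) > k * max(mad, 1.0) for c in pixel_counts]
```

Each region returned by `connected_regions` would yield one box or point prompt for SAM; a slice would then be treated as a PSP if `is_fragmented` is true for any lesion or if `outlier_slices` flags it.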

Main results: Compared to the initial U-Net segmentations, the BBPM increased patients' DSC by 3.93±1.48% to 0.847±0.008. PSPs constituted 20.01–21.63% of the PPMs' predictions; without substitution, performance decreased by 82.94±3.17% to a DSC of 0.139±0.023. When PSP identification techniques were combined and PSPs were substituted with BBPM predictions, the PPMs surpassed the initial U-Net, achieving improvements of up to 4.17±1.40% and reaching a DSC of 0.849±0.007. The final PSP identification method achieved a PSP sensitivity of 92.95±1.20%.
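
For readers who want to reproduce this kind of comparison, the sketch below shows one way to substitute flagged slices, compute a volumetric DSC, and express a change as a percentage. It is a hedged illustration only: the function names are hypothetical, the empty-volume convention (DSC = 1 when both volumes are empty) is an assumption, and it assumes the percentages above are changes relative to the baseline DSC, which the abstract does not state explicitly.

```python
def substitute_psps(ppm_slices, fallback_slices, psp_flags):
    """Replace each flagged PPM slice with the fallback (e.g. BBPM) slice."""
    return [fb if flagged else ppm
            for ppm, fb, flagged in zip(ppm_slices, fallback_slices, psp_flags)]


def volumetric_dsc(pred, truth):
    """Dice similarity coefficient over a whole 3D volume.

    pred and truth are binary volumes indexed as [slice][row][col].
    """
    intersection = pred_total = truth_total = 0
    for pred_slice, truth_slice in zip(pred, truth):
        for pred_row, truth_row in zip(pred_slice, truth_slice):
            for p, t in zip(pred_row, truth_row):
                intersection += 1 if (p and t) else 0
                pred_total += 1 if p else 0
                truth_total += 1 if t else 0
    denominator = pred_total + truth_total
    if denominator == 0:
        return 1.0  # assumed convention: two empty volumes agree perfectly
    return 2.0 * intersection / denominator


def relative_change_percent(baseline_dsc, refined_dsc):
    """Change of refined_dsc relative to baseline_dsc, in percent (assumed reading)."""
    return 100.0 * (refined_dsc - baseline_dsc) / baseline_dsc
```

Pooling voxels across all slices before taking the ratio is what makes the score volumetric; averaging per-slice DSC values instead would weight small and large slices equally and generally gives a different number.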

Significance: Our results highlight strengths and weaknesses of sparse SAM prompting methods, which can be helpful for future designs of both automatic and manual pipelines. To our knowledge, this is the first study to propose methods of PSP identification and substitution to close the gap between PPM and BBPM performance for MIS.