Synthetic data accelerates the development of generalizable learning-based algorithms for X-ray image analysis

Gao, Cong; Killeen, Benjamin D.; Hu, Yihe; Grupp, Robert B.; Taylor, Russell H.; Armand, Mehran; Unberath, Mathias

doi:10.1038/s42256-023-00629-1

Cited by 39 publications

(12 citation statements)

References 78 publications


“…Stochastic generative models, such as the one presented in this work, allow the generation of large amounts of data with specific statistics, which are useful for validating methodologies of analysis [67] or feeding machine learning algorithms when real data are scarce [68,69]. Furthermore, generative models provide new insights into the studied process itself by characterizing how the model operates to produce the stochastic field [66].…”

Section: Discussion (mentioning)

confidence: 99%

A multiscale and multicriteria generative adversarial network to synthesize 1-dimensional turbulent fields

Granero Belinchon, Cabeza Gallucci. 2024. Mach. Learn.: Sci. Technol.


This article introduces a new Neural Network stochastic model to generate a 1-dimensional stochastic field with turbulent velocity statistics. Both the model architecture and training procedure ground on the Kolmogorov and Obukhov statistical theories of fully developed turbulence, so guaranteeing descriptions of 1) energy distribution, 2) energy cascade and 3) intermittency across scales in agreement with experimental observations. The model is a Generative Adversarial Network with multiple multiscale optimization criteria. First, we use three physics-based criteria: the variance, skewness and flatness of the increments of the generated field that retrieve respectively the turbulent energy distribution, energy cascade and intermittency across scales. Second, the Generative Adversarial Network criterion, based on reproducing statistical distributions, is used on segments of different length of the generated field. Furthermore, to mimic multiscale decompositions frequently used in turbulence’s studies, the model architecture is fully convolutional with kernel sizes varying along the multiple layers of the model. To train our model we use turbulent velocity signals from grid turbulence at Modane wind tunnel.


“…Therefore, the DeepDRR framework [27] is used in this work, because it could simulate physics-based radiographs by considering spectrum-aware forward projection, beam hardening, scatter estimation, and noise injection [20]. In addition, a recent publication in Nature Machine Intelligence underscores the effectiveness of DeepDRR in the advancement of generalizable deep learning-based algorithms for real x-ray image analysis [28]. In the process of forward projection, the source-to-detector distance is 1164 mm and the source-to-isocenter distance is 700 mm.…”

Section: Methods (mentioning)

confidence: 99%

Simulation‐driven training of vision transformers enables metal artifact reduction of highly truncated CBCT scans

Fan, Ritschl, Beister et al. 2023. Medical Physics


Background: Due to the high attenuation of metals, severe artifacts occur in cone beam computed tomography (CBCT). The metal segmentation in CBCT projections usually serves as a prerequisite for metal artifact reduction (MAR) algorithms.

Purpose: The occurrence of truncation caused by the limited detector size leads to the incomplete acquisition of metal masks from the threshold-based method in CBCT volume. Therefore, segmenting metal directly in CBCT projections is pursued in this work.

Methods: Since the generation of high quality clinical training data is a constant challenge, this study proposes to generate simulated digital radiographs (data I) based on real CT data combined with self-designed computer aided design (CAD) implants. In addition to the simulated projections generated from 3D volumes, 2D x-ray images combined with projections of implants serve as the complementary data set (data II) to improve the network performance. In this work, SwinConvUNet consisting of shift window (Swin) vision transformers (ViTs) with patch merging as encoder is proposed for metal segmentation.

Results: The model's performance is evaluated on accurately labeled test datasets obtained from cadaver scans as well as the unlabeled clinical projections. When trained on the data I only, the convolutional neural network (CNN) encoder-based networks UNet and TransUNet achieve only limited performance on the cadaver test data, with an average dice score of 0.821 and 0.850. After using both data II and data I during training, the average dice scores for the two models increase to 0.906 and 0.919, respectively. By replacing the CNN encoder with Swin transformer, the proposed SwinConvUNet reaches an average dice score of 0.933 for cadaver projections when only trained on the data I. Furthermore, SwinConvUNet has the largest average dice score of 0.953 for cadaver projections when trained on the combined data set.

Conclusions: Our experiments quantitatively demonstrate the effectiveness of the combination of the projections simulated under two pathways for network training. Besides, the proposed SwinConvUNet trained on the simulated projections performs state-of-the-art, robust metal segmentation as demonstrated on experiments on cadaver and clinical data sets. With the accurate segmentations from the proposed model, MAR can be conducted even for highly truncated CBCT scans.
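The dice scores quoted in this abstract measure the overlap between a predicted mask and a reference mask, 2|A∩B| / (|A| + |B|). The following is a minimal sketch of that standard definition for illustration only; it is not the authors' evaluation code, and the function name `dice_score`, the flat 0/1 mask representation, and the empty-mask convention are all assumptions made here.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as flat sequences of 0/1.

    Hypothetical helper for illustration; not taken from the cited paper.
    """
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as perfect agreement.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0, 1, 0]
    reference = [1, 0, 0, 0, 1, 1]
    print(round(dice_score(predicted, reference), 3))
```

An average dice score such as those reported above would then be the mean of this value over all test projections, assuming per-image averaging.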


“…These are placed along the field of view with random location and orientation. We project digitally reconstructed radiographs (DRRs) from the patient CT and tool mesh models using a modified version of the DeepDRR simulator [26], which has been shown to support sim-to-real transfer for X-rays [8,11,15,15]. Using this version, we simultaneously obtain realistic X-ray transmission images and corresponding projected segmentations for each organ or tool present, enabling synchronous dataset generation of 448×448 images at a rate of ∼ 4 images / s on an RTX 2080 Ti, using less than 4 GB of GPU memory.…”

Section: A Large Scale Dataset For X-ray Image Analysis (mentioning)

confidence: 99%

“…X-ray imaging is a workhorse imaging modality for diagnostic and interventional healthcare. There is enormous opportunity for quantitative, comprehensive, and automated segmentation of X-ray images to accelerate research and development in precision medicine [3,4,8,14,16,25,26,29]. Prior efforts have contributed machine learning techniques for X-ray image analysis that perform well within a narrow scope, but fail to apply broadly to a large swath of possible uses.…”

Section: Introduction (mentioning)

confidence: 99%


Toward perception-based anticipation of cortical breach during K-wire fixation of the pelvis

Killeen, Chakraborty, Osgood et al. 2022. Medical Imaging 2022: Physics of Medical Imaging


Automated X-ray image segmentation would accelerate research and development in diagnostic and interventional precision medicine. Prior efforts have contributed task-specific models capable of solving specific image analysis problems, but the utility of these models is restricted to their particular task domain, and expanding to broader use requires additional data, labels, and retraining efforts. Recently, foundation models (FMs), machine learning models trained on large amounts of highly variable data thus enabling broad applicability, have emerged as promising tools for automated image analysis. Existing FMs for medical image analysis focus on scenarios and modalities where objects are clearly defined by visually apparent boundaries, such as surgical tool segmentation in endoscopy. X-ray imaging, by contrast, does not generally offer such clearly delineated boundaries or structure priors. During X-ray image formation, complex 3D structures are projected in transmission onto the imaging plane, resulting in overlapping features of varying opacity and shape. To pave the way toward an FM for comprehensive and automated analysis of arbitrary medical X-ray images, we develop FluoroSAM, a language-aligned variant of the Segment-Anything Model, trained from scratch on 1.6M synthetic X-ray images from a wide variety of human anatomies, X-ray projection geometries, energy spectra, and viewing angles. FluoroSAM is trained on data including masks for 128 organ types and 464 non-anatomical objects, such as tools and implants. In real X-ray images of cadaveric specimens, FluoroSAM is able to segment bony anatomical structures based on text-only prompting with 0.51 and 0.79 DICE with point-based refinement, outperforming competing SAM variants for all structures. FluoroSAM is also capable of zero-shot generalization to segmenting classes beyond the training set thanks to its language alignment, which we demonstrate for full lung segmentation on real chest X-rays. Code, data, and model weights are available.
