2023
DOI: 10.1002/mp.16219

Semi‐supervised training using cooperative labeling of weakly annotated data for nodule detection in chest CT

Abstract: Purpose: Machine learning algorithms are best trained with large quantities of accurately annotated samples. While natural scene images can often be labeled relatively cheaply and at large scale, obtaining accurate annotations for medical images is both time-consuming and expensive. In this study, we propose a cooperative labeling method that allows us to make use of weakly annotated medical imaging data for the training of a machine learning algorithm. As most clinically produced data are weakly annotated…

Cited by 4 publications (4 citation statements)
References 27 publications
“…The AI/ML research program 80 strives to address these challenges by conducting peer-reviewed research to develop and understand methods for enhanced AI/ML training, developing systematic approaches for understanding AI/ML robustness, and assessing novel test methodologies to evaluate fixed and continuously learning AI/ML performance in both the premarket and real-world settings, to name just a few areas of ongoing research. Some of the regulatory science projects being conducted as part of the OSEL AI/ML research program include a recent investigation developing a cooperative labeling technique to incorporate weakly labeled data into the training of a deep learning AI/ML model for lung nodule detection in CT. 81 This study showed that the inclusion of weakly labeled data leads to a 5% improvement in lung nodule detection performance when the number of expert annotations is limited. Another approach for addressing small dataset sizes is to augment available data with synthetic datasets.…”
Section: Discussion (mentioning)
confidence: 83%
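The cooperative-labeling idea summarized in the excerpt above, combining a limited pool of expert annotations with weakly annotated clinical data, can be illustrated with a minimal sketch of a mixed training objective. The pseudo-labeling rule, the confidence threshold, and the `weak_weight` factor below are illustrative assumptions, not the method published in the cited study.

```python
# Minimal sketch of a semi-supervised objective that mixes a small set of
# expert-annotated ("strong") ROIs with weakly annotated ROIs whose targets
# are proposed by the model itself. The thresholding rule and weighting are
# illustrative assumptions, not the cooperative-labeling method of the study.
import torch
import torch.nn.functional as F

def mixed_nodule_loss(model, strong_x, strong_y, weak_x,
                      weak_weight=0.5, threshold=0.9):
    """strong_x/weak_x: (N, 1, D, H, W) ROI batches; strong_y: float 0/1 labels."""
    # Standard supervised term on the expert-annotated batch.
    loss_strong = F.binary_cross_entropy_with_logits(
        model(strong_x).squeeze(-1), strong_y)

    # Pseudo-label term: the current model scores the weak batch, and only
    # confidently scored ROIs are kept as training targets.
    with torch.no_grad():
        weak_prob = torch.sigmoid(model(weak_x).squeeze(-1))
    confident = (weak_prob > threshold) | (weak_prob < 1.0 - threshold)
    if confident.any():
        pseudo_y = (weak_prob[confident] > 0.5).float()
        loss_weak = F.binary_cross_entropy_with_logits(
            model(weak_x[confident]).squeeze(-1), pseudo_y)
    else:
        loss_weak = strong_x.new_zeros(())

    return loss_strong + weak_weight * loss_weak
```

In practice the weak term would typically be ramped up over training so that early, unreliable pseudo-labels do not dominate the gradient; that schedule is likewise an assumption here.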
“…Some of the regulatory science projects being conducted as part of the OSEL AI/ML research program include a recent investigation developing a cooperative labeling technique to incorporate weakly labeled data into the training of a deep learning AI/ML model for lung nodule detection in CT. 81 This study showed that the inclusion of weakly labeled data leads to a 5% improvement in lung nodule detection performance when the number of expert annotations is limited.…”
Section: Discussion (mentioning)
confidence: 99%
“…In the first set of experiments, we train the false positive reduction algorithm 8 using augmentations of flip, rotation, and translation, which are considered the baseline among common data augmentation methods for this application. 8 The results of these experiments are reported in Table 1. Each row of the table reports the sensitivity and CPM performance when a given number of LIDC scans are used for training, followed by the results when the LIDC dataset is complemented by the data from 569 phantom scans.…”
Section: Results (mentioning)
confidence: 99%
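The baseline flip/rotation/translation augmentation described in the excerpt above could look roughly like the following sketch for a single 3D ROI; the angle range, shift range, and interpolation settings are assumptions chosen for illustration, not the values used in the cited experiments.

```python
# Rough sketch of the baseline flip / rotation / translation augmentation of a
# single 3D ROI volume. The +/-15 degree angle range, 2-voxel shift range, and
# linear interpolation are illustrative assumptions.
import numpy as np
from scipy.ndimage import rotate, shift

def augment_roi(roi, rng=None):
    """roi: 3D numpy array (z, y, x). Returns a randomly augmented copy."""
    rng = rng or np.random.default_rng()
    # Random flip along each spatial axis.
    for axis in range(3):
        if rng.random() < 0.5:
            roi = np.flip(roi, axis=axis)
    # Random small in-plane (axial) rotation.
    angle = rng.uniform(-15.0, 15.0)
    roi = rotate(roi, angle, axes=(1, 2), reshape=False, order=1, mode="nearest")
    # Random translation of up to 2 voxels along each axis.
    roi = shift(roi, rng.integers(-2, 3, size=3), order=1, mode="nearest")
    return roi
```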
“…The training framework, including the hyper-parameters and neural network architecture, is selected in an identical manner to our previous false positive reduction network. 8 This network consists of successive three-dimensional convolutions, max-pooling layers, and Leaky ReLU activations, followed by two dense fully connected layers that assign a nodule score to each ROI. The training data, however, are presented to the network with a new augmentation paradigm: during training, each positive and negative ROI is passed through an image augmentation selected at random.…”
Section: Training Framework (mentioning)
confidence: 99%
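The architecture summarized in this excerpt, successive 3D convolutions with max-pooling and Leaky ReLU followed by two dense layers that assign a nodule score to each ROI, might be sketched as below. Channel counts, kernel sizes, and the 32x32x32 input ROI size are assumptions for illustration, not the values of the cited network.

```python
# Sketch of a 3D CNN false-positive-reduction classifier of the kind described
# above: stacked 3D convolutions with Leaky ReLU and max-pooling, followed by
# two fully connected layers that output a nodule score per ROI. Channel
# counts, kernel sizes, and the 32x32x32 input size are assumptions.
import torch
import torch.nn as nn

class NoduleScorer(nn.Module):
    def __init__(self, in_channels=1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(in_channels, 16, kernel_size=3, padding=1),
            nn.LeakyReLU(0.1),
            nn.MaxPool3d(2),                    # 32^3 -> 16^3
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.LeakyReLU(0.1),
            nn.MaxPool3d(2),                    # 16^3 -> 8^3
            nn.Conv3d(32, 64, kernel_size=3, padding=1),
            nn.LeakyReLU(0.1),
            nn.MaxPool3d(2),                    # 8^3 -> 4^3
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 4 * 4 * 4, 128),
            nn.LeakyReLU(0.1),
            nn.Linear(128, 1),                  # nodule score (logit) per ROI
        )

    def forward(self, roi):                     # roi: (N, in_channels, 32, 32, 32)
        return self.classifier(self.features(roi))
```

Under the per-sample augmentation paradigm described in the excerpt, each ROI would be passed through one transform chosen at random before being scored, e.g. `roi = random.choice(augmentations)(roi)`; this selection rule is a sketch, not necessarily the study's exact scheme.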