Purpose
Training deep learning models to differentiate benign from malignant breast tumors in ultrasound images requires many training samples with clear labels. Biopsy results can serve as benign/malignant labels, but most clinical samples do not have biopsy results. Previous works have proposed generating benign/malignant labels from Breast Imaging Reporting and Data System (BI‐RADS) ratings. However, this approach produces noisy labels: the benign/malignant labels derived from BI‐RADS diagnoses may be inconsistent with the ground truth. Consequently, deep models tend to overfit the noisy labels and generalize poorly. In this work, we focus on reducing the negative effect of noisy labels when they are used to train breast tumor classification models.
Methods
We propose an effective approach called the noise filter network (NF‐Net) to address noisy labels when training breast tumor classification models. Specifically, to prevent deep models from overfitting the noisy labels, we incorporate two softmax layers for classification. Additionally, to strengthen the effect of clean labels, we design a teacher–student module that distills the knowledge of clean labels.
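The abstract does not specify how the two softmax layers are connected or how the distillation loss is defined, so the following is only an illustrative sketch under stated assumptions, not the NF‐Net implementation. It assumes a noise-adaptation-style arrangement in which the first softmax estimates the true class and the second softmax turns learnable logits into a label-transition matrix, and it assumes a KL-divergence distillation term. All function names, logits, and the 0.5 weight are hypothetical.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def noisy_label_probs(class_logits, transition_logits):
    # First softmax: p(true label | image).
    p_true = softmax(class_logits)
    # Second softmax (assumed design): each row becomes
    # p(noisy label | true label), i.e. a row-stochastic transition matrix.
    transition = [softmax(row) for row in transition_logits]
    n = len(p_true)
    # Marginalize over the true label to predict the observed noisy label.
    p_noisy = [sum(p_true[i] * transition[i][j] for i in range(n))
               for j in range(n)]
    return p_true, p_noisy

def cross_entropy(probs, label):
    # Negative log-likelihood of the observed label.
    return -math.log(probs[label])

def distillation_loss(teacher_probs, student_probs):
    # KL(teacher || student); assumed form of the teacher-student term.
    return sum(t * math.log(t / s)
               for t, s in zip(teacher_probs, student_probs) if t > 0)

# Hypothetical values for one image; classes are [benign, malignant].
class_logits = [0.2, 1.1]                      # student output
transition_logits = [[2.0, 0.0], [0.0, 2.0]]   # learnable noise model
p_true, p_noisy = noisy_label_probs(class_logits, transition_logits)

# The noisy (BI-RADS-derived) label supervises the second softmax output.
loss_noisy = cross_entropy(p_noisy, label=1)

# A teacher trained on clean (biopsy-confirmed) samples supplies soft targets.
teacher_probs = softmax([-0.5, 1.5])
loss_distill = distillation_loss(teacher_probs, p_true)

total_loss = loss_noisy + 0.5 * loss_distill   # 0.5 is an arbitrary weight
```

In this arrangement the noisy label never supervises the first softmax directly, which is one way a model might avoid memorizing label errors; whether NF‐Net works this way is not stated in the abstract.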
Results
We conduct extensive comparisons with existing methods for handling noisy labels. Our method achieves a classification accuracy of 73%, with a precision of 69%, a recall of 80%, and an F1‐score of 0.74. These results are significantly better than those of existing state‐of‐the‐art methods for handling noisy labels.
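As a quick consistency check on the reported metrics, the F1‐score can be recomputed as the harmonic mean of the stated precision and recall. The helper name `f1_score` is chosen here for illustration only.

```python
def f1_score(precision, recall):
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)

# Precision and recall as reported above (69% and 80%).
f1 = f1_score(0.69, 0.80)
print(round(f1, 2))  # 0.74
```

The recomputed value agrees with the reported F1‐score of 0.74.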
Conclusions
This work provides a means of overcoming the label shortage problem in training breast tumor classification models: benign/malignant labels can be generated from BI‐RADS ratings. Although this approach introduces noisy labels, the design of NF‐Net effectively reduces their negative effect.