Self-distillation Augmented Masked Autoencoders for Histopathological Image Classification

Luo, Yuan; Chen, Zhineng; Gao, Xieping

doi:10.48550/arxiv.2203.16983

Cited by 3 publications

(7 citation statements)

References 23 publications


“…Hence transferring decoder information to the encoder with self-distillation improves the outcomes of self-learning. We also observe that similar to [6,7] and unlike [10], our results show that predicting the masked area only outperforms predicting all image pixels for both SimMIM and our SD-SimMIM.…”

Section: Quantitative Results (supporting)

confidence: 66%

“…We believe that the visible patches in the decoder contain more knowledge than the ones in the encoder. Moreover, similar to [6,7] and unlike [10], we found out that predicting the masked area only outperforms predicting all image pixels.…”

Section: Introduction (supporting)

confidence: 65%

“…Knowledge Distillation is the process of transferring knowledge from a large model to a smaller one [13]. Previous studies apply it to the vectors at various depths within the same network, either a convolutional neural network (CNN) [14] or a Vision Transformer (ViT) [10]. Hence, knowledge is distilled from deep layers to shallow layers, augmenting the feature representation of shallow layers.…”

Section: Self-distillation (mentioning)

confidence: 99%

“…Inspired by [6,7,10], we hypothesize that the Swin encoder can be improved by transferring knowledge obtained by decoded visible patches to their encoded peers through self-distillation. We believe that the visible patches in the decoder contain more knowledge than the ones in the encoder.…”

Section: Introduction (mentioning)

confidence: 99%


Self-Supervised Learning with Masked Image Modeling for Teeth Numbering, Detection of Dental Restorations, and Instance Segmentation in Dental Panoramic Radiographs

Almalki; Latecki (2023)

2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)


The computer-assisted radiologic informative report has received increasing research attention to facilitate diagnosis and treatment planning for dental care providers. However, manual interpretation of dental images is limited, expensive, and time-consuming. Another barrier in dental imaging is the limited number of available images for training, which is a challenge in the era of deep learning. This study proposes a novel self-distillation (SD) enhanced self-supervised learning on top of the masked image modeling (SimMIM) Transformer, called SD-SimMIM, to improve the outcome with a limited number of dental radiographs. In addition to the prediction loss on masked patches, SD-SimMIM computes the self-distillation loss on the visible patches. We apply SD-SimMIM on dental panoramic X-rays for teeth numbering, detection of dental restorations and orthodontic appliances, and instance segmentation tasks. Our results show that SD-SimMIM outperforms other self-supervised learning methods. Furthermore, we augment and improve the annotation of an existing dataset of panoramic X-rays.
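The two-term objective described in this abstract can be illustrated with a minimal sketch. Only the split is taken from the abstract: a prediction loss on masked patches plus a self-distillation loss on visible patches. Everything else is an assumption made for illustration: the function names, the plain-list patch representation, the use of L1 for the prediction term, mean squared error for the distillation term, and the `weight` parameter are hypothetical and are not claimed to match the SD-SimMIM implementation.

```python
from typing import Sequence

Patch = Sequence[float]


def masked_l1(pred: Sequence[Patch], target: Sequence[Patch],
              mask: Sequence[bool]) -> float:
    """Mean absolute error over masked patches only (mask[i] is True if masked)."""
    total, count = 0.0, 0
    for p, t, m in zip(pred, target, mask):
        if m:
            total += sum(abs(a - b) for a, b in zip(p, t))
            count += len(p)
    return total / count if count else 0.0


def visible_distill(enc: Sequence[Patch], dec: Sequence[Patch],
                    mask: Sequence[bool]) -> float:
    """Mean squared error between encoder and decoder features on visible patches.

    Assumption: the decoder features act as the teacher. In a real training
    loop they would typically be excluded from gradient computation.
    """
    total, count = 0.0, 0
    for e, d, m in zip(enc, dec, mask):
        if not m:
            total += sum((a - b) ** 2 for a, b in zip(e, d))
            count += len(e)
    return total / count if count else 0.0


def combined_loss(pred, target, enc, dec, mask, weight: float = 1.0) -> float:
    """Prediction loss on masked patches plus weighted distillation loss on visible ones."""
    return masked_l1(pred, target, mask) + weight * visible_distill(enc, dec, mask)


# Toy example: two patches of two values each; patch 0 is masked, patch 1 is visible.
mask = [True, False]
pred = [[1.0, 2.0], [0.0, 0.0]]
target = [[0.0, 0.0], [0.0, 0.0]]
enc = [[9.0, 9.0], [1.0, 1.0]]
dec = [[0.0, 0.0], [3.0, 1.0]]
loss = combined_loss(pred, target, enc, dec, mask, weight=0.5)
```

In this toy example the masked patch contributes only to the prediction term and the visible patch only to the distillation term, so the two terms never overlap on the same patch.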


“…Macro F1 and ROC AUC are reported for multi-class classification tasks. It should be noted that the “background” (BACK) class is not considered for NCT-CRC-HE neither for training nor for evaluation, following (23, 50, 77, 78).…”

Section: Experimental and Evaluation Setup (mentioning)

confidence: 99%

Scaling Self-Supervised Learning for Histopathology with Masked Image Modeling

Filiot; Ghermi; Olivier; et al. (2023)

Preprint


Computational pathology is revolutionizing the field of pathology by integrating advanced computer vision and machine learning technologies into diagnostic workflows. Recently, self-supervised learning (SSL) has emerged as a promising solution to learn representations from histology patches, leveraging large volumes of unannotated whole slide images (WSI). In particular, Masked Image Modeling (MIM) showed remarkable results and robustness over purely contrastive learning methods. In this work, we explore the application of MIM to histology using iBOT, a self-supervised transformer-based framework. Through a wide range of downstream tasks over seven cancer indications, we provide recommendations on the pre-training of large models for histology data using MIM. First, we demonstrate that in-domain pre-training with iBOT outperforms both ImageNet pre-training and a model pre-trained with a purely contrastive learning objective, MoCo V2. Second, we show that Vision Transformers models (ViT), when scaled appropriately, have the capability to learn pan-cancer representations that benefit a large variety of downstream tasks. Finally, our iBOT ViT-Base model, pre-trained on more than 40 million histology images from 16 different cancer types, achieves state-of-the-art performance in most weakly-supervised WSI classification tasks compared to other SSL frameworks.


Contrastive Deep Encoding Enables Uncertainty-aware Machine-learning-assisted Histopathology

Wadduwage; Sivaroopan; Jayanga; et al. (2023)

Preprint


Deep neural network models can learn clinically relevant features from millions of histopathology images. However generating high-quality annotations to train such models for each hospital, each cancer type, and each diagnostic task is prohibitively laborious. On the other hand, terabytes of training data –while lacking reliable annotations– are readily available in the public domain in some cases. In this work, we explore how these large datasets can be consciously utilized to pre-train deep networks to encode informative representations. We then fine-tune our pre-trained models on a fraction of annotated training data to perform specific downstream tasks. We show that our approach can reach the current state-of-the-art (SOTA) for patch-level classification with only 1-10% randomly selected annotations compared to other SOTA approaches. Moreover, we propose an uncertainty-aware loss function, to quantify the model confidence during inference. Quantified uncertainty helps experts select the best instances to label for further training. Our uncertainty-aware labeling reaches the SOTA with significantly fewer annotations compared to random labeling. Last, we demonstrate how our pre-trained encoders can surpass current SOTA for whole-slide image classification with weak supervision. Our work lays the foundation for data and task-agnostic pre-trained deep networks with quantified uncertainty.
