2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr46437.2021.01466

Multiresolution Knowledge Distillation for Anomaly Detection

Cited by 346 publications (190 citation statements). References 21 publications.
Citation statements by type: 2 supporting, 188 mentioning, 0 contrasting.
“…Given respective nominal representations and novel test representations, anomaly detection can then be a simple matter of reconstruction errors [44], distances to k nearest neighbours [18], or fine-tuning of a one-class classification model such as OC-SVMs [46] or SVDD [50,56] on top of these features. For the majority of these approaches, anomaly localization comes naturally from pixel-wise reconstruction errors; saliency-based approaches such as GradCAM [47] or XRAI [28] can be used for anomaly segmentation [52,42,45] as well.…”
Section: Related Work (mentioning)
confidence: 99%
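A minimal sketch of the feature-based kNN scoring this excerpt describes, assuming an ImageNet-pretrained torchvision ResNet-18 as the backbone; the layer choice, k=5, and the placeholder image tensors are illustrative, not the cited papers' exact settings:

```python
# kNN anomaly scoring on pretrained features: fit on nominal data only,
# score test images by distance to their nearest nominal neighbours.
import torch
import torchvision.models as models
from sklearn.neighbors import NearestNeighbors

backbone = models.resnet18(weights="IMAGENET1K_V1")
backbone.fc = torch.nn.Identity()  # keep pooled 512-d features, drop classifier
backbone.eval()

@torch.no_grad()
def embed(images):
    """images: (N, 3, 224, 224), ImageNet-normalized."""
    return backbone(images).numpy()

nominal_images = torch.randn(64, 3, 224, 224)  # placeholder anomaly-free set
test_images = torch.randn(8, 3, 224, 224)      # placeholder test set

# Fit on nominal features only; no anomalies are seen during training.
knn = NearestNeighbors(n_neighbors=5).fit(embed(nominal_images))

# Anomaly score = mean distance to the 5 nearest nominal features.
dists, _ = knn.kneighbors(embed(test_images))
scores = dists.mean(axis=1)  # higher = more anomalous
```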
“…To better account for the distribution shift between natural pretraining data and industrial image data, subsequent adaptation can be performed, e.g. via student-teacher knowledge distillation [24] as in [6,45], or via normalizing flows [17,30] trained on top of pretrained network features [42].…”
Section: Related Work (mentioning)
confidence: 99%
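As a rough illustration of the student-teacher adaptation mentioned above, the sketch below trains a small randomly initialized student to regress a frozen pretrained teacher's feature maps on nominal images; a large teacher-student discrepancy at test time flags anomalies. The architectures, loss, and hyperparameters are assumptions for illustration, not the settings of [6,45]:

```python
# Student-teacher feature distillation for anomaly detection (sketch).
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.models as models

teacher = models.resnet18(weights="IMAGENET1K_V1")
teacher = nn.Sequential(*list(teacher.children())[:-2])  # (N, 512, 7, 7) maps
teacher.eval()
for p in teacher.parameters():
    p.requires_grad_(False)  # teacher stays frozen

student = nn.Sequential(  # small, randomly initialized student
    nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(64, 256, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(256, 512, 3, stride=8, padding=1),  # match teacher output shape
)
opt = torch.optim.Adam(student.parameters(), lr=1e-4)

def train_step(nominal_batch):
    # Feature regression on anomaly-free images only.
    with torch.no_grad():
        t = teacher(nominal_batch)
    loss = F.mse_loss(student(nominal_batch), t)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

@torch.no_grad()
def anomaly_map(x):
    # Per-location squared discrepancy, upsampled to the input resolution.
    d = (teacher(x) - student(x)).pow(2).mean(dim=1, keepdim=True)
    return F.interpolate(d, size=x.shape[-2:], mode="bilinear", align_corners=False)
```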
“…A uniformly generated semi-orthogonal matrix [8] avoids the singular case while retaining performance and cubically reducing the computational cost of the batch inverse. We achieve new state-of-the-art results on the benchmark datasets MVTec AD [9], KolektorSDD [10], KolektorSDD2 [11], and mSTC [12], outperforming competitive reconstruction-error-based [1][2][3] and knowledge-distillation-based [4,5] methods by substantial margins. Moreover, we show that our method, decoupled from the pre-trained CNNs, can exploit advances in discriminative models without a fine-tuning procedure.…”
Section: Introduction (mentioning)
confidence: 93%
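A sketch of the semi-orthogonal projection idea in this excerpt, under the assumption of a Gaussian model fitted on projected features: a random semi-orthogonal W (here obtained by QR decomposition of a Gaussian matrix) reduces D-dimensional features to d dimensions, so the covariance to invert is d x d rather than D x D. The dimensions and the regularizer are illustrative, not the values used in [8]:

```python
# Semi-orthogonal projection (W^T W = I_d) before Gaussian fitting,
# cutting the cubic cost of covariance inversion from O(D^3) to O(d^3).
import numpy as np

D, d = 512, 100
rng = np.random.default_rng(0)
# QR of a Gaussian matrix yields orthonormal columns (a semi-orthogonal W).
W, _ = np.linalg.qr(rng.standard_normal((D, d)))

def fit_gaussian(feats):
    """feats: (N, D) nominal features; returns mean and inverse covariance."""
    z = feats @ W                                     # (N, d) projected
    mu = z.mean(axis=0)
    cov = np.cov(z, rowvar=False) + 1e-3 * np.eye(d)  # avoid the singular case
    return mu, np.linalg.inv(cov)

def mahalanobis_scores(feats, mu, cov_inv):
    """feats: (M, D) test features; higher score = more anomalous."""
    z = feats @ W - mu
    return np.sqrt(np.einsum("ij,jk,ik->i", z, cov_inv, z))
```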
“…The common idea is to train generative networks that learn low-dimensional features by minimizing reconstruction error, expecting higher error for anomalies not present in training than for anomaly-free data. However, networks with sufficient capacity can reconstruct even the anomalies, degrading performance; the perceptual loss for generative networks [1] and the knowledge-distillation loss for teacher-student network pairs [4,5] achieve only limited success in mitigating this.…”
Section: Introduction (mentioning)
confidence: 99%
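For concreteness, a minimal sketch of the reconstruction-error baseline this excerpt describes: a small convolutional autoencoder trained on anomaly-free images, with per-pixel reconstruction error as the anomaly map. The architecture is a hypothetical example, not the cited papers' models:

```python
# Autoencoder reconstruction-error baseline for anomaly detection (sketch).
import torch
import torch.nn as nn

class ConvAE(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 4, stride=2, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.dec(self.enc(x))

model = ConvAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

def train_step(nominal_batch):
    # Train on anomaly-free images only, minimizing reconstruction error.
    recon = model(nominal_batch)
    loss = (recon - nominal_batch).pow(2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

@torch.no_grad()
def anomaly_map(x):
    # Higher per-pixel reconstruction error => more likely anomalous.
    return (model(x) - x).pow(2).mean(dim=1)
```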