With the wide deployment of deep neural network (DNN) classifiers, there is great potential for harm from adversarial learning attacks. Recently, a special type of data poisoning (DP) attack, known as a backdoor, was proposed. These attacks do not seek to degrade classification accuracy, but rather to have the classifier learn to classify to a target class whenever the backdoor pattern is present in a test example. Launching backdoor attacks does not require knowledge of the classifier or its training process; it only requires the ability to poison the training set with a sufficient number of exemplars containing a sufficiently strong backdoor pattern, labeled with the target class. Defenses against backdoor DP attacks can be deployed before/during training, post-training, or in-flight, i.e., during classifier operation/test time. Here, we address post-training detection of backdoor attacks in DNN image classifiers, a scenario seldom considered in existing works, wherein the defender does not have access to the poisoned training set, but only to the trained classifier itself and to clean (unpoisoned) examples from the classification domain. This scenario is of great interest because a trained classifier may be the basis of, e.g., a phone app that will be shared with many users; detecting a backdoor post-training may thus reveal a widespread attack. We propose a purely unsupervised anomaly detection (AD) defense against imperceptible backdoor attacks that: i) detects whether the trained DNN has been backdoor-attacked; ii) infers the source and target classes involved in a detected attack; and iii) can even accurately estimate the backdoor pattern itself. Our AD approach involves learning, via suitable cost function minimization, the minimum-size perturbation (putative backdoor) required to induce the classifier to misclassify (most) examples from class s to class t, for all (s, t) class pairs. Our hypothesis is that non-attacked pairs require large perturbations, while attacked pairs require much smaller ones; this is convincingly borne out experimentally. We identify a variety of plausible cost functions and devise a novel, robust hypothesis-testing approach to perform detection inference. We test our approach, in comparison with alternative defenses, for several backdoor patterns, data sets, and attack settings, and demonstrate its favorability. Our defense essentially requires setting a single hyperparameter (the detection threshold), which can, e.g., be chosen to fix the system's false positive rate.

The first two authors contributed equally to this work.
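To make the per-pair perturbation-learning step concrete, the following is a minimal sketch, not the authors' implementation. It assumes a PyTorch classifier `model`, a batch of clean source-class images `x_s` with pixel values in [0, 1], and a single additive perturbation shared across the batch; the cost function shown (cross-entropy toward the target class plus an L2 size penalty weighted by a hypothetical `lam`) is just one plausible choice among the several the paper considers.

```python
import torch
import torch.nn.functional as F

def estimate_min_perturbation(model, x_s, target_class, steps=300, lr=0.01, lam=0.1):
    """Sketch: learn a small additive perturbation v such that model(x_s + v)
    is (mostly) classified to target_class. The L2 norm of the learned v is the
    per-(s, t) statistic: anomalously small norms suggest a backdoored pair.
    `lam` is a hypothetical weight trading off misclassification inducement
    against perturbation size."""
    model.eval()
    # One perturbation shared by all source-class images (putative backdoor pattern).
    v = torch.zeros_like(x_s[0], requires_grad=True)
    opt = torch.optim.Adam([v], lr=lr)
    t = torch.full((x_s.shape[0],), target_class, dtype=torch.long, device=x_s.device)
    for _ in range(steps):
        opt.zero_grad()
        # Keep perturbed images in the valid pixel range.
        logits = model(torch.clamp(x_s + v, 0.0, 1.0))
        loss = F.cross_entropy(logits, t) + lam * v.norm(p=2)
        loss.backward()
        opt.step()
    return v.detach(), v.detach().norm(p=2).item()
```

Running such an optimization for every ordered class pair (s, t) yields a collection of perturbation-size statistics; the robust hypothesis test then flags pairs whose sizes are anomalously small relative to the rest, with the detection threshold chosen, e.g., to fix the false positive rate.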