Density of States Estimation for Out-of-Distribution Detection

Morningstar, Warren R.; Ham, Cusuh; Gallagher, Andrew; Lakshminarayanan, Balaji; Alemi, Alexander A.; Dillon, Joshua V.

doi:10.48550/arxiv.2006.09273

Cited by 8 publications

(12 citation statements)

References 10 publications

“…Bulusu et al, 2020, for a recent survey]. In particular, HCL detects task changes by measuring the typicality of the model's statistics, which is similar to recently proposed state-of-the-art OOD detection methods by Nalisnick et al [2019c] and Morningstar et al [2020]. In some of our experiments, we apply HCL to embeddings extracted by a deep neural network; develop a related method for OOD detection, where a flow-based generative model approximates the density of intermediate representations of the data.…”

Section: Related Work (mentioning)

confidence: 89%

“…Similarly to prior work on anomaly detection [Nalisnick et al, 2019c] and [Morningstar et al, 2020], we detect task changes measuring the typicality of the HCL model's statistics. Following Morningstar et al [2020], we can use the following statistics on data batches B:…”

Section: Task Identification (mentioning)

confidence: 99%

“…Intuitively, the model will not be able to classify unknown task samples correctly when the data distribution shifts, so the task-conditional likelihood p(B|t) = p(y|x, t)p(x|t) of the batch B which comes from a new task t should be low. Moreover, motivated by recent advances in OOD detection with generative models [Nalisnick et al, 2019c, Morningstar et al, 2020], we propose to detect task changes using two-sided test on HCL's multiple statistics and demonstrate that HCL is able to correctly identify task change not only in standard CL benchmarks, but also in FashionMNIST-MNIST continual learning problem, which is a more challenging scenario as identified in Nalisnick et al [2019a]. Note that prior works in continual learning which are based on a VAE model (Rao et al [2019] and Lee et al [2020]) rely on VAE's likelihood to determine task change points which may not be reliable in challenging settings [Nalisnick et al, 2019a].…”

Section: Task Identification (mentioning)

confidence: 99%
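
The two-sided typicality test referred to in the citation statements above can be sketched roughly as follows. This is a minimal illustration only, not the implementation used in HCL or by Morningstar et al.: the statistic (mean log-likelihood of a batch), the toy standard-normal density, the threshold `k`, and the helper names `batch_statistic`, `fit_typicality`, `is_atypical` and `gaussian_log_prob` are all assumptions made for the example.

```python
import numpy as np


def gaussian_log_prob(x):
    """Toy model: log-density of a standard normal, summed over dimensions."""
    dim = x.shape[-1]
    return -0.5 * np.sum(x ** 2, axis=-1) - 0.5 * dim * np.log(2 * np.pi)


def batch_statistic(log_probs):
    """Summary statistic of a batch: its mean log-likelihood (assumed choice)."""
    return float(np.mean(log_probs))


def fit_typicality(train_stats):
    """Estimate center and spread of the statistic on in-distribution batches."""
    stats = np.asarray(train_stats, dtype=float)
    return float(stats.mean()), float(stats.std(ddof=1))


def is_atypical(stat, center, spread, k=3.0):
    """Two-sided test: flag a statistic that is too low OR too high."""
    return abs(stat - center) > k * spread


rng = np.random.default_rng(0)
dim, batch_size = 32, 64

# Reference statistics collected from in-distribution batches.
train_stats = [
    batch_statistic(gaussian_log_prob(rng.standard_normal((batch_size, dim))))
    for _ in range(200)
]
center, spread = fit_typicality(train_stats)

# Two shifted batches: one with HIGHER likelihood than typical, one with lower.
shrunk = 0.1 * rng.standard_normal((batch_size, dim))
spread_out = 3.0 * rng.standard_normal((batch_size, dim))

for name, batch in [("shrunk", shrunk), ("spread_out", spread_out)]:
    stat = batch_statistic(gaussian_log_prob(batch))
    print(name, round(stat, 2), is_atypical(stat, center, spread))
```

In this toy setting the `shrunk` batch receives a higher likelihood than in-distribution data, so a one-sided likelihood threshold would accept it, while the two-sided check flags it because its statistic lies far from the values seen on reference batches.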

Task-agnostic Continual Learning with Hybrid Probabilistic Models

Kirichenko, Farajtabar, Rao et al. 2021

Preprint

Self Cite

Learning new tasks continuously without forgetting on a constantly changing data distribution is essential for real-world problems but extremely challenging for modern deep learning. In this work we propose HCL, a Hybrid generative-discriminative approach to Continual Learning for classification. We model the distribution of each task and each class with a normalizing flow. The flow is used to learn the data distribution, perform classification, identify task changes, and avoid forgetting, all leveraging the invertibility and exact likelihood which are uniquely enabled by the normalizing flow model. We use the generative capabilities of the flow to avoid catastrophic forgetting through generative replay and a novel functional regularization technique. For task identification, we use state-of-the-art anomaly detection techniques based on measuring the typicality of the model's statistics. We demonstrate the strong performance of HCL on a range of continual learning benchmarks such as split-MNIST, split-CIFAR, and SVHN-MNIST. * Work partially done as an intern at DeepMind.

“…Other methods focus on the data distribution directly: Nalisnick et al (2019a) discovered that the density learned by generative models cannot distinguish between ID and OOD inputs. Various works study this observation identifying background statistic (Ren et al, 2019), excessive influence of input complexity (Serrà et al, 2020), and mismatch between the typical set and high-density regions (Nalisnick et al, 2019b; Choi et al, 2019; Morningstar et al, 2020) as causes. In comparison to our work, these methods focus on flow-based and autoregressive density methods with tractable likelihood.…”

Section: Related Work (mentioning)

confidence: 99%

On Out-of-distribution Detection with Energy-based Models

Elflein, Charpentier, Zügner et al. 2021

Preprint

Several density estimation methods have been shown to fail to detect out-of-distribution (OOD) samples by assigning higher likelihoods to anomalous data. Energy-based models (EBMs) are flexible, unnormalized density models which seem to be able to improve upon this failure mode. In this work, we provide an extensive study investigating OOD detection with EBMs trained with different approaches on tabular and image data and find that EBMs do not provide consistent advantages. We hypothesize that EBMs do not learn semantic features despite their discriminative structure similar to Normalizing Flows. To verify this hypothesis, we show that supervision and architectural restrictions improve the OOD detection of EBMs independent of the training approach.

“…First note that, similarly to tabular data, semantic node features are less likely to suffer from the same flaws. Second, following previous works [14,15,46,68,97], GPN mitigates this issue by using density estimation on a latent space which is low-dimensional and task-specific. Nonetheless, we emphasize that GPN provides predictive uncertainty estimates which depends on the considered task i.e.…”

Section: Limitations and Impact (mentioning)

confidence: 99%

Graph Posterior Network: Bayesian Predictive Uncertainty for Node Classification

Stadler, Charpentier, Geisler et al. 2021

Preprint

The interdependence between nodes in graphs is key to improving class predictions on nodes and is utilized in approaches like Label Propagation (LP) or in Graph Neural Networks (GNNs). Nonetheless, uncertainty estimation for non-independent node-level predictions is under-explored. In this work, we explore uncertainty quantification for node classification in three ways: (1) We derive three axioms explicitly characterizing the expected predictive uncertainty behavior in homophilic attributed graphs. (2) We propose a new model Graph Posterior Network (GPN) which explicitly performs Bayesian posterior updates for predictions on interdependent nodes. GPN provably obeys the proposed axioms. (3) We extensively evaluate GPN and a strong set of baselines on semi-supervised node classification including detection of anomalous features, and detection of left-out classes. GPN outperforms existing approaches for uncertainty estimation in the experiments.
