This paper studies probability distributions of penultimate activations of classification networks. We show that, when a classification network is trained with the cross-entropy loss, its final classification layer forms a Generative-Discriminative pair with a generative classifier based on a specific distribution of penultimate activations. More importantly, the distribution is parameterized by the weights of the final fully-connected layer, and can be regarded as a generative model that synthesizes penultimate activations without feeding input data. We empirically demonstrate that this generative model enables stable knowledge distillation in the presence of domain shift, and can transfer knowledge from a classifier to variational autoencoders and generative adversarial networks for class-conditional image generation.
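To make the notion of a Generative-Discriminative pair concrete, the sketch below shows one classical instance of the correspondence for a given final fully-connected layer: a generative classifier with unit-covariance Gaussian class conditionals, whose class means are the rows of the weight matrix and whose priors are derived from the biases, reproduces the softmax posterior exactly, and its generative side can sample synthetic penultimate activations for a chosen class without any input image. The weight and bias values, the unit-covariance Gaussian family, and all variable names here are illustrative assumptions for exposition, not the specific distribution derived in the paper.

```python
import numpy as np

# Hypothetical final-layer parameters of a trained classifier:
# W has shape (num_classes, feat_dim), b has shape (num_classes,).
rng = np.random.default_rng(0)
num_classes, feat_dim = 5, 16
W = rng.normal(size=(num_classes, feat_dim))
b = rng.normal(size=num_classes)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def discriminative_posterior(a):
    # Posterior of the discriminative final layer: softmax(W a + b).
    return softmax(a @ W.T + b)

# Matching generative classifier: class conditionals N(a; w_y, I) with
# class priors proportional to exp(b_y + ||w_y||^2 / 2).
log_prior = b + 0.5 * (W ** 2).sum(axis=1)

def generative_posterior(a):
    # log p(a|y) + log pi_y, dropping terms that do not depend on y.
    log_joint = a @ W.T - 0.5 * (W ** 2).sum(axis=1) + log_prior
    return softmax(log_joint)

# The two posteriors coincide for any penultimate activation a.
a = rng.normal(size=feat_dim)
assert np.allclose(discriminative_posterior(a), generative_posterior(a))

# The generative side can also synthesize activations of a chosen class
# without feeding any input: sample around that class's weight vector.
y = 2
synthetic_a = rng.normal(loc=W[y], scale=1.0, size=feat_dim)
```

Under these assumptions the equivalence follows from expanding the Gaussian log-density: the class-dependent part of log p(a|y) + log pi_y reduces to w_y^T a + b_y, which is exactly the logit of the softmax layer.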
INTRODUCTION

Deep neural networks have achieved remarkable success in image classification [10, 13]. In most of these networks, an input image is first processed by multiple layers of neurons, whose final output, called the penultimate activations, is in turn fed to the last fully connected layer that performs classification. These networks are typically trained end-to-end by minimizing the cross-entropy loss. The penultimate activations are the deepest image representation in such networks and have proven useful for various purposes beyond classification, such as image retrieval [45], semantic segmentation [26], and general image description of unseen classes [35]. This paper studies the penultimate activations of classifica-

* The two authors contributed equally. This work was done while Minkyo Seo was visiting Kakao as a research intern.