2019
DOI: 10.48550/arxiv.1902.02603
Preprint
Radial and Directional Posteriors for Bayesian Neural Networks

Changyong Oh,
Kamil Adamczewski,
Mijung Park

Abstract: We propose a new variational family for Bayesian neural networks. We decompose the variational posterior into two components: the radial component captures the strength of each neuron in terms of its magnitude, while the directional component captures the statistical dependencies among the weight parameters. The dependencies learned via the directional density provide better modeling performance compared to the widely used Gaussian mean-field-type variational family. In addition, the strength of input and…
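For intuition, here is a minimal sketch (not from the paper; the weight values are made up) of the decomposition the abstract describes: any weight vector factors exactly into a radial part (its magnitude) and a directional part (a point on the unit sphere).

```python
# Minimal sketch (hypothetical values, for intuition only): factor a weight
# vector w into a radial component r = ||w||_2 (the neuron's "strength") and
# a directional component u = w / ||w||_2 (a unit vector on the sphere, where
# dependencies among the weights live).
import numpy as np

w = np.array([0.3, -1.2, 0.7, 2.1])   # weights of one neuron (example values)
r = np.linalg.norm(w)                  # radial component: magnitude
u = w / r                              # directional component: unit vector

assert np.isclose(np.linalg.norm(u), 1.0)  # u lies on the unit sphere
assert np.allclose(r * u, w)               # w is exactly recovered from (r, u)
print(r, u)
```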

Cited by 5 publications (7 citation statements: 0 supporting, 7 mentioning, 0 contrasting)
References 24 publications
“…Analogous to the likelihood function, variational dropout [37], which is used in this work, approximates the posteriors p(θ|D) by Gaussian distributions with diagonal covariance, imposing restrictive assumptions of unimodality and statistical independence between neural network weights. More recent advances in Bayesian deep learning research [110–115] could be used to enhance the quality of parameter uncertainty estimation by allowing the model to capture multi-modality and statistical dependencies between parameters.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
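To make the restriction this excerpt describes concrete, the following is a minimal, hypothetical sketch of a mean-field Gaussian variational layer (not the cited papers' code; the class name and initialization are assumptions). Every weight gets its own independent Gaussian, so the posterior is unimodal and cannot express dependencies between weights.

```python
# Minimal sketch (hypothetical): a mean-field Gaussian variational linear
# layer with the reparameterization trick. Each weight is sampled as
# w_ij = mu_ij + sigma_ij * eps, with an independent N(0, 1) eps per weight,
# so the approximate posterior has diagonal covariance. Bias omitted for brevity.
import torch
import torch.nn as nn

class MeanFieldLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(out_features, in_features))
        # Parameterize sigma through rho via softplus to keep it positive.
        self.rho = nn.Parameter(torch.full((out_features, in_features), -5.0))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        sigma = nn.functional.softplus(self.rho)
        eps = torch.randn_like(sigma)      # independent noise per weight
        w = self.mu + sigma * eps          # reparameterized weight sample
        return x @ w.t()

layer = MeanFieldLinear(4, 2)
print(layer(torch.randn(3, 4)).shape)      # torch.Size([3, 2])
```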
“…Another interesting prior is the radial-directional prior, which disentangles the direction of the weight vector from its length [168]. It is given by…”
Section: Weight-space Priors (citation type: mentioning)
confidence: 99%
“…where p_dir is a distribution over the d-dimensional unit sphere and p_rad is a distribution over ℝ. It has been proposed by Oh et al [168] to use the von-Mises-Fisher distribution (see Eq. (13)) for p_dir and the half-Cauchy (see Eq.…”
Section: Weight-space Priors (citation type: mentioning)
confidence: 99%
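The display equation between these two excerpts was lost in extraction. Based on the surrounding text, it is presumably the standard product-form radial-directional density; a hedged reconstruction in LaTeX (notation assumed, not the survey's exact typesetting):

```latex
% Plausible reconstruction (assumed notation): the radial-directional prior
% factorizes the density of a weight vector w into independent radial and
% directional parts.
p(\mathbf{w}) = p_{\mathrm{rad}}\big(\lVert \mathbf{w} \rVert_2\big)\,
                p_{\mathrm{dir}}\!\left(\frac{\mathbf{w}}{\lVert \mathbf{w} \rVert_2}\right)
```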
“…As the vMF distribution is one of the simplest distributions for directional data, mixtures of vMFs have been widely used for clustering directional data [2,8]. For Bayesian inference of neural network weights, vMF distributions are used to model the directional statistics of the weights that are decomposed into radial and directional components [28]. Also, vMF embedding spaces have been studied for deep metric learning [9] since such hypersphere embedding spaces are more desirable than conventional Euclidean spaces when their dimension is large.…”
Section: Learning with vMF Distributions (citation type: mentioning)
confidence: 99%
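As an illustrative sketch of how such radial and directional components can be sampled in practice, here is an assumption-laden example, not the implementation of [28] or of Oh et al.: it draws directions from a vMF distribution and radii from a half-Cauchy, and requires SciPy ≥ 1.11 for scipy.stats.vonmises_fisher.

```python
# Minimal sketch (assumptions: SciPy >= 1.11 provides scipy.stats.vonmises_fisher;
# mu, kappa, and the sample count are arbitrary illustrative choices). A weight
# vector is sampled as w = r * u, where the direction u ~ vMF(mu, kappa) lies on
# the unit sphere and the radius r ~ half-Cauchy sets the magnitude.
import numpy as np
from scipy.stats import vonmises_fisher, halfcauchy

d = 5                                    # dimensionality of the weight vector
rng = np.random.default_rng(0)

mu = np.ones(d) / np.sqrt(d)             # mean direction (unit norm)
kappa = 20.0                             # concentration around mu

u = vonmises_fisher(mu, kappa).rvs(size=1000, random_state=rng)  # directions
r = halfcauchy.rvs(size=1000, random_state=rng)                  # radii
w = r[:, None] * u                       # radial-directional weight samples

print(np.linalg.norm(u, axis=1)[:3])     # ~1.0: directions lie on the sphere
print(w.shape)                           # (1000, 5)
```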