2021
DOI: 10.48550/arxiv.2108.11000
Preprint

Layer Adaptive Node Selection in Bayesian Neural Networks: Statistical Guarantees and Implementation Details

Abstract: Sparse deep neural networks have proven to be efficient for predictive model building in large-scale studies. Although several works have studied theoretical and numerical properties of sparse neural architectures, they have primarily focused on edge selection. Sparsity through edge selection might be intuitively appealing; however, it does not necessarily reduce the structural complexity of a network. Instead, pruning excessive nodes in each layer leads to a structurally sparse network which would have low…
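
The abstract's distinction between edge selection and node selection can be made concrete with a short sketch. The layer width, magnitude threshold, and node-scoring rule below are illustrative assumptions rather than the paper's actual criterion; the point is only that edge-level masking leaves the layer's width intact, while node-level masking removes whole units and yields a structurally narrower network.

import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 128))  # one fully connected layer: 64 output nodes, 128 inputs

# Edge selection: zero out individual weights by magnitude.
# The layer still has all 64 nodes, so the architecture is structurally unchanged.
edge_mask = (np.abs(W) > 1.0).astype(W.dtype)
W_edge = W * edge_mask

# Node selection: drop entire output nodes (whole rows of W).
# The surviving sub-network is genuinely narrower, i.e. structurally sparse.
node_score = np.linalg.norm(W, axis=1)                 # one score per output node
node_mask = (node_score > np.median(node_score)).astype(W.dtype)
W_node = W * node_mask[:, None]

print("fraction of edges pruned:", 1 - edge_mask.mean())
print("nodes kept: %d of %d" % (int(node_mask.sum()), node_mask.size))
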

Cited by 1 publication (5 citation statements)
References 15 publications

“…Baselines. Our baselines include the frequentist model of a deterministic deep neural network (trained with SGD), BNN [39], spike-and-slab BNN for node sparsity [30], single forward pass ensemble models including rank-1 BNN Gaussian ensemble [10], MIMO [11], and EDST ensemble [12], multiple forward pass ensemble methods: DST ensemble [12] and Dense ensemble of deterministic neural networks. For fair comparison, we keep the training hardware, environment, data augmentation, and training schedules of all the models same.…”
Section: Experiments: Results and Analysis
confidence: 99%
“…Dynamic sparsity learning for our sequential ensemble of sparse BNNs is achieved via spike-and-slab prior: a Dirac spike (δ 0 ) at 0 and a uniform slab distribution elsewhere [40]. We adopt the sparse BNN model of [30] to achieve the structural sparsity in Bayesian neural networks. Specifically a common indicator variable z is used for all the weights incident on a node which helps to prune away the given node while training.…”
Section: Related Work
confidence: 99%
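The node-level indicator described in the statement above (one shared variable z per node gating every weight incident on that node) can be sketched as follows. This is a hedged illustration under my own assumptions: the module name NodeSparseLinear, the relaxed-Bernoulli sampling, and the 0.5 pruning threshold are mine and do not come from the authors' implementation of [30]; only the gating mechanism, not the full spike-and-slab variational objective, is shown.

import torch
import torch.nn as nn
import torch.nn.functional as F

class NodeSparseLinear(nn.Module):
    """Linear layer whose output nodes are gated by a shared indicator z per node."""
    def __init__(self, in_features, out_features, temperature=0.1):
        super().__init__()
        self.weight = nn.Parameter(0.1 * torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.z_logit = nn.Parameter(torch.zeros(out_features))  # variational logit of q(z_j = 1)
        self.temperature = temperature

    def forward(self, x):
        q = torch.sigmoid(self.z_logit)
        if self.training:
            # Relaxed Bernoulli (concrete) sample keeps z differentiable during training.
            u = torch.rand_like(q).clamp(1e-6, 1 - 1e-6)
            z = torch.sigmoid((q.log() - (1 - q).log() + u.log() - (1 - u).log()) / self.temperature)
        else:
            z = (q > 0.5).float()  # hard prune at test time: z_j = 0 removes node j entirely
        # One indicator per output node multiplies the whole row of incident weights.
        return F.linear(x, self.weight * z.unsqueeze(1), self.bias * z)

layer = NodeSparseLinear(128, 64)
print(layer(torch.randn(32, 128)).shape)  # torch.Size([32, 64])

A complete treatment would also add the KL divergence between q(z) and the spike-and-slab prior (Dirac spike at 0, uniform slab elsewhere) to the training objective; that term is omitted in this sketch.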