2021
DOI: 10.1016/j.neucom.2021.04.117

Bayesian tensorized neural networks with automatic rank selection

Cited by 25 publications (15 citation statements)
References 41 publications
“…3) Extensions to Other Tensor Decomposition Models: Similar ideas have been applied to other tensor decomposition models, including Tucker decomposition (TuckerD) [47] and tensor train decomposition (TTD) [48]-[50]. In these works, one first assumes an over-parametrized model by setting the model configuration parameters (e.g., the multi-linear ranks in TuckerD and the TT ranks in TTD) to large values, and then imposes a GSM prior on the associated model parameters to control the model complexity; see the detailed discussions in [47]-[50].…”
Section: Sparsity-aware Modeling For Tensor Decompositions (mentioning)
confidence: 99%
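The quoted passage describes a general recipe: over-parametrize the decomposition ranks, then let a Gaussian scale mixture (GSM) prior shrink unneeded rank components toward zero. Below is a minimal, hedged sketch of that idea for a tensor-train model; the mode sizes, the maximum rank, and the per-slice scale variables are illustrative assumptions, not the formulation used in [47]-[50] or in the cited paper.

```python
import numpy as np

# Illustrative sketch (not the cited papers' code): over-parametrized
# tensor-train (TT) cores with a Gaussian-scale-mixture-style prior whose
# per-rank scales can shrink individual rank slices toward zero.

dims = [4, 8, 8, 4]     # tensor mode sizes (assumed)
max_rank = 16           # deliberately over-parametrized TT rank (assumed)

# TT cores G_k of shape (r_{k-1}, n_k, r_k), with boundary ranks r_0 = r_d = 1.
ranks = [1] + [max_rank] * (len(dims) - 1) + [1]
cores = [0.1 * np.random.randn(ranks[k], dims[k], ranks[k + 1])
         for k in range(len(dims))]

# One GSM scale (variance) per internal rank index; driving a scale toward
# zero prunes the corresponding slice of the two cores that share it,
# which lowers the effective TT rank.
scales = [np.ones(r) for r in ranks[1:-1]]

def log_prior(cores, scales):
    """Zero-mean Gaussian log-prior whose per-slice variance is the GSM scale."""
    lp = 0.0
    for k, lam in enumerate(scales):
        right = cores[k]        # (r_{k-1}, n_k, r_k): slices along its last axis
        left = cores[k + 1]     # (r_k, n_{k+1}, r_{k+1}): slices along its first axis
        for r, v in enumerate(lam):
            sq = np.sum(right[..., r] ** 2) + np.sum(left[r] ** 2)
            n = right[..., r].size + left[r].size
            lp += -0.5 * sq / v - 0.5 * n * np.log(2.0 * np.pi * v)
    return lp
```

In a full Bayesian treatment the scales themselves receive hyperpriors (e.g., inverse-Gamma or half-Cauchy), which yields the heavy-tailed GSM marginal and the automatic rank-selection behaviour the passage refers to.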
“…Since the KL divergence is nonnegative, the equality in (48) holds if and only if it is equal to zero.…”
Section: A. Evidence Maximization Framework (mentioning)
confidence: 99%
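The quoted sentence invokes a standard fact from evidence (marginal likelihood) maximization; equation (48) of the citing paper is not reproduced here. Assuming it takes the familiar form of the evidence decomposition, the argument is the following sketch:

```latex
\ln p(\mathbf{y})
  = \underbrace{\mathbb{E}_{q(\boldsymbol{\theta})}\!\left[\ln \frac{p(\mathbf{y},\boldsymbol{\theta})}{q(\boldsymbol{\theta})}\right]}_{\text{evidence lower bound } \mathcal{L}(q)}
  \;+\;
  \underbrace{\mathrm{KL}\!\left(q(\boldsymbol{\theta})\,\middle\|\,p(\boldsymbol{\theta}\mid\mathbf{y})\right)}_{\ge 0}
```

Because the KL term is nonnegative, $\ln p(\mathbf{y}) \ge \mathcal{L}(q)$, and the bound is tight if and only if the KL divergence equals zero, i.e., $q(\boldsymbol{\theta}) = p(\boldsymbol{\theta}\mid\mathbf{y})$.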
“…One of the most common methods is the Tucker factorization (Cohen et al., 2016), which can generate high-quality DNNs when compressing fully-connected layers. Tensor Train (TT) and Tensor Ring (TR) decomposition techniques have recently been studied in the context of DNNs (Hawkins & Zhang, 2019; Wang et al., 2018). But previous work has explored the accuracy trade-off for fully-connected and convolutional layers only.…”
Section: Related Work (mentioning)
confidence: 99%
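The passage describes compressing fully-connected layers with Tucker, TT, or TR factorizations. The sketch below illustrates the parameter saving of a single TT-format fully-connected layer; the mode factorizations and the TT rank are assumptions chosen for illustration, not values taken from the cited works (Cohen et al., 2016; Hawkins & Zhang, 2019; Wang et al., 2018).

```python
import numpy as np

# Hedged sketch: a 256x256 fully-connected weight matrix stored in
# tensor-train (TT) format instead of densely. Shapes and rank are assumed.

in_modes, out_modes, rank = [4, 8, 8], [4, 8, 8], 4   # 4*8*8 = 256
ranks = [1, rank, rank, 1]

# TT cores G_k of shape (r_{k-1}, in_k, out_k, r_k).
cores = [0.1 * np.random.randn(ranks[k], in_modes[k], out_modes[k], ranks[k + 1])
         for k in range(3)]

def tt_to_matrix(cores):
    """Contract the TT cores back into a dense (prod(in), prod(out)) matrix."""
    w = cores[0]                                   # (1, i1, o1, r1)
    for g in cores[1:]:
        w = np.tensordot(w, g, axes=([-1], [0]))   # merge along the shared TT rank
    w = np.squeeze(w, axis=(0, -1))                # drop the boundary ranks of size 1
    # Reorder to (i1, i2, i3, o1, o2, o3), then flatten into a matrix.
    w = np.transpose(w, (0, 2, 4, 1, 3, 5))
    return w.reshape(int(np.prod(in_modes)), int(np.prod(out_modes)))

dense_params = int(np.prod(in_modes)) * int(np.prod(out_modes))  # 65,536
tt_params = sum(g.size for g in cores)                           # 1,344 here
print(dense_params, tt_params, tt_to_matrix(cores).shape)
```

In practice the dense matrix is never materialized; the forward pass contracts the input against the cores directly, which is where the memory and compute savings come from.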
“…This manuscript is an extended version of our recent work (Hawkins and Zhang, 2021), which reported SVGD training for Bayesian tensorized neural networks. Our manuscript extends Hawkins and Zhang (2021) in the following ways: 1. In our previous work, we tested only one Bayesian sampler (SVGD).…”
Section: Introduction (mentioning)
confidence: 99%
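SVGD, the sampler named above, maintains a set of particles and moves them jointly with a kernelized update that balances a driving term (toward high posterior density) against a repulsive term (keeping particles spread apart). A minimal sketch of that update on a toy target follows; it is not the authors' implementation, and the fixed RBF bandwidth and the standard-normal target are assumptions made only for illustration.

```python
import numpy as np

def svgd_step(x, grad_log_p, step=0.1, h=1.0):
    """One SVGD update for particles x of shape (n, d) with an RBF kernel."""
    n = x.shape[0]
    diff = x[:, None, :] - x[None, :, :]        # diff[i, j] = x_i - x_j
    sq = np.sum(diff ** 2, axis=-1)
    k = np.exp(-sq / (2.0 * h ** 2))            # k[i, j] = k(x_i, x_j), symmetric
    # Driving term: sum_j k(x_j, x_i) * grad log p(x_j).
    drive = k @ grad_log_p(x)
    # Repulsive term: sum_j grad_{x_j} k(x_j, x_i) = sum_j (x_i - x_j) k / h^2.
    repulse = np.sum(diff * k[..., None], axis=1) / h ** 2
    return x + step * (drive + repulse) / n

# Toy usage: 50 particles drift toward an assumed standard-normal target.
rng = np.random.default_rng(0)
particles = rng.normal(5.0, 1.0, size=(50, 2))
for _ in range(500):
    particles = svgd_step(particles, grad_log_p=lambda x: -x)
print(particles.mean(axis=0))   # should move close to [0, 0]
```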