2020
DOI: 10.1016/j.neunet.2020.05.034

Block-term tensor neural networks

Abstract: Deep neural networks (DNNs) have achieved outstanding performance in a wide range of applications, e.g., image classification, natural language processing, etc. Despite the good performance, the huge number of parameters in DNNs brings challenges to efficient training of DNNs and also their deployment in low-end devices with limited computing resources. In this paper, we explore the correlations in the weight matrices, and approximate the weight matrices with the low-rank block-term tensors. We name the new co…
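The core idea in the abstract, representing a fully connected layer's weight matrix as a sum of low-rank Tucker ("block") terms, can be sketched in a few lines. The NumPy snippet below is only an illustration under assumed, made-up sizes (the modes m1, m2, n1, n2, the rank r, the block count, and the helper bt_reconstruct are all hypothetical), not the paper's actual layer or training procedure; it shows how a block-term format reconstructs the weight and how many parameters it stores compared with the dense matrix.

```python
import numpy as np

# Hypothetical sizes: W is (m1*m2) x (n1*n2), approximated by `blocks` Tucker
# terms, each with rank r in every mode (an illustrative block-term format).
m1, m2, n1, n2, r, blocks = 4, 8, 4, 8, 2, 3

rng = np.random.default_rng(0)
cores   = [rng.standard_normal((r, r, r, r)) for _ in range(blocks)]        # learned in practice
factors = [[rng.standard_normal((dim, r)) for dim in (m1, m2, n1, n2)]      # learned in practice
           for _ in range(blocks)]

def bt_reconstruct(cores, factors):
    """Sum of Tucker terms: W[i1,i2,j1,j2] = sum_b G_b x_1 U1 x_2 U2 x_3 U3 x_4 U4."""
    W = np.zeros((m1, m2, n1, n2))
    for G, (U1, U2, U3, U4) in zip(cores, factors):
        W += np.einsum('abcd,ia,jb,kc,ld->ijkl', G, U1, U2, U3, U4)
    return W.reshape(m1 * m2, n1 * n2)

W_hat = bt_reconstruct(cores, factors)
x = rng.standard_normal(n1 * n2)
y = W_hat @ x                                            # forward pass with the compressed weights

dense_params = m1 * m2 * n1 * n2                         # 1024
bt_params = blocks * (r**4 + r * (m1 + m2 + n1 + n2))    # 192
print(y.shape, dense_params, bt_params)
```

In practice the factorized weight is never reconstructed explicitly; the input is contracted directly with the small factors and cores, which is where the memory and compute savings come from.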

Cited by 22 publications (5 citation statements). References 59 publications.

“…Employing TT on Conv layers is introduced in [38], where the 4D kernel tensor is reshaped to size […] and the input feature maps are reshaped to […]. In the feedforward phase, the tensorized input is contracted with each TT-core one by one.…”
Section: Fig. 4 A Fourth-Order Tensor in TT Format
confidence: 99%
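The "contracted with each TT-core one by one" step in this snippet can be made concrete with a small sketch. The NumPy code below uses assumed mode sizes and TT-ranks (in_modes, out_modes, ranks, and tt_forward are hypothetical names) and shows a generic TT-matrix acting on an already-tensorized input, after the reshaping the snippet describes; it is not the cited paper's actual convolution routine.

```python
import numpy as np

# Hypothetical shapes: the weight acts as a TT-matrix with input modes
# (n1, n2, n3) and output modes (m1, m2, m3); TT-core k has shape
# (r_{k-1}, n_k, m_k, r_k) with boundary ranks r_0 = r_3 = 1.
in_modes, out_modes, ranks = (4, 4, 4), (3, 3, 3), (1, 2, 2, 1)

rng = np.random.default_rng(0)
cores = [rng.standard_normal((ranks[k], in_modes[k], out_modes[k], ranks[k + 1]))
         for k in range(len(in_modes))]

def tt_forward(cores, x):
    """Contract the tensorized input with each TT-core one by one."""
    t = x[..., None]                         # append the dummy boundary rank r_0 = 1
    for G in cores:                          # G: (r_{k-1}, n_k, m_k, r_k)
        # absorb one input mode and the current rank, emit one output mode and the next rank
        t = np.einsum('n...r,rnms->...ms', t, G)
    return t[..., 0]                         # drop the dummy boundary rank r_d = 1

x = rng.standard_normal(in_modes)            # tensorized input
y = tt_forward(cores, x)
print(y.shape)                               # (3, 3, 3)
```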
“…In this setting, other types of decompositions have also been explored, including Tensor-Ring [131] and Block-Term Decomposition [132]. This strategy has also been extended to parametrize other types of layers [133].…”
Section: A. Parameterizing Fully-Connected Layers
confidence: 99%
“…The groundbreaking works (Novikov et al., 2015; Garipov et al., 2016) demonstrate that the low-order parameter structures can be efficiently compressed via tensor-train decomposition (Oseledets, 2011) by first reshaping the structures into a higher-order tensor. This idea is later extended in two directions: tensor-train decomposition is used to compress LSTM/GRU layers in recurrent neural networks (Yang et al., 2017), higher-order recurrent neural networks (Yu et al., 2017; Su et al., 2020), and 3D convolutional layers (Wang et al., 2020); other decompositions are also explored for better compression, such as tensor-ring decomposition (Zhao et al., 2016) and block-term decomposition (Ye et al., 2020). et al. (2015) proposed to train the student network with the teacher network's logits (the vector before the softmax layer).…”
Section: Model Compression of Neural Network
confidence: 99%
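As a rough, back-of-the-envelope illustration of the compression-by-reshaping idea this snippet attributes to Novikov et al. (2015): the layer size, mode sizes, and uniform TT-rank below are invented for illustration and do not come from any of the cited papers.

```python
# Hypothetical example: a 1024 x 1024 dense layer vs. the same matrix stored as a
# TT-matrix after reshaping into five input modes and five output modes of size 4,
# with a uniform internal TT-rank r (boundary ranks are 1).
in_modes  = [4, 4, 4, 4, 4]      # 4**5 = 1024 input units
out_modes = [4, 4, 4, 4, 4]      # 4**5 = 1024 output units
r = 8

dense_params = 1024 * 1024       # 1,048,576 parameters in the dense matrix
ranks = [1] + [r] * (len(in_modes) - 1) + [1]
tt_params = sum(ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1]
                for k in range(len(in_modes)))   # 3,328 for these sizes

print(dense_params, tt_params)
```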