2019
DOI: 10.1016/j.asoc.2019.03.057

Compression of recurrent neural networks for efficient language modeling

Abstract: Recurrent neural networks have proved to be an effective method for statistical language modeling. However, in practice their memory and run-time complexity are usually too large to be implemented in real-time offline mobile applications. In this paper we consider several compression techniques for recurrent neural networks, including Long Short-Term Memory models. We pay particular attention to the high-dimensional output problem caused by the very large vocabulary size. We focus on effective compression meth…
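Since the abstract singles out the high-dimensional output problem, a concrete illustration may help: the sketch below shows one common low-rank treatment of the output layer, where a thin bottleneck of rank r replaces the dense hidden-to-vocabulary projection. This is a generic PyTorch sketch of the idea, not the authors' exact method; the class name and all sizes are assumptions.

```python
# Minimal sketch: a dense H x V output projection is replaced by two thin
# projections H -> r -> V, a common way to shrink the softmax layer when the
# vocabulary V is very large. Names and sizes are illustrative.
import torch
import torch.nn as nn

class LowRankOutputLayer(nn.Module):
    def __init__(self, hidden_size: int, vocab_size: int, rank: int):
        super().__init__()
        self.down = nn.Linear(hidden_size, rank, bias=False)  # H -> r
        self.up = nn.Linear(rank, vocab_size)                 # r -> V

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # Returns unnormalized logits over the vocabulary.
        return self.up(self.down(hidden))

# Rough parameter comparison for an illustrative configuration.
H, V, r = 256, 50_000, 64
dense_params = H * V + V                # single H x V layer with bias
lowrank_params = H * r + r * V + V      # factorized pair with bias
print(dense_params, lowrank_params)     # 12850000 vs. 3266384
```

The same bottleneck idea extends to the input embedding, which in a word-level language model is usually the other parameter-heavy block.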

Cited by 33 publications (22 citation statements)
References 18 publications
Citation statements (ordered by relevance):
“…We also compared our approach to other compression techniques: matrix decomposition-based (Grachev et al., 2019) and VD-based (Chirkova et al., 2018). For the last one we used a similar model: a network with one LSTM layer of 256 hidden units.…”
Section: Methods (citation type: mentioning; confidence: 99%)
“…The weight gradually increases from zero to one during the first several epochs of training. This technique allows achieving better final performance of the model because such a train-… [the remainder of this excerpt is a fragment of a results table comparing LR for Softmax (Grachev et al., 2019), TT for Softmax (Grachev et al., 2019), Chirkova et al. (2018), and DSVI-ARD (ours)]”
Section: Training and Evaluation (citation type: mentioning; confidence: 99%)
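The excerpt above describes a loss-term weight that grows from zero to one over the first few epochs, a standard annealing trick in variational training. Below is a minimal, hedged sketch of such a schedule; the function name, the linear shape, and the warm-up length are assumptions rather than details from the cited paper.

```python
# Illustrative linear warm-up: the regularization weight ramps from 0 to 1
# over the first `warmup_epochs` epochs and then stays at 1.
def annealing_weight(epoch: int, warmup_epochs: int = 5) -> float:
    if warmup_epochs <= 0:
        return 1.0
    return min(1.0, epoch / warmup_epochs)

for epoch in range(8):
    w = annealing_weight(epoch)
    # total_loss = data_loss + w * kl_loss   # how the weight would be used
    print(epoch, round(w, 2))                # 0.0, 0.2, 0.4, ..., 1.0, 1.0
```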
“…Our study concentrated on the ability to estimate dose in heterogeneous geometries, and no effort was made in improving the model efficiency. Various model compression techniques, for example, pruning, quantization, and tensor decomposition methods (achieving low-rank structures in the weight matrices) [51-53], may substantially lower the number of parameters in fully connected layers [54,55]. The efficiency of the model can be further enhanced through fine-tuning of the model architecture.…”
Section: In This Paper We Have Demonstrated the General Feasibility (citation type: mentioning; confidence: 99%)
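The citing study above only names pruning, quantization, and tensor decomposition as options for shrinking fully connected layers. As an illustration of the simplest of these, here is a hedged magnitude-pruning sketch; the layer shape and sparsity level are assumptions, not values from any cited work.

```python
# Illustrative magnitude pruning of one fully connected layer: the weights
# with the smallest absolute values are zeroed out.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(512, 512))              # fully connected weight matrix

sparsity = 0.9                               # fraction of weights to remove
threshold = np.quantile(np.abs(W), sparsity)
mask = np.abs(W) >= threshold                # keep the largest-magnitude 10%
W_pruned = W * mask

print(round(mask.mean(), 2))                 # ~0.1 of the weights survive
```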
“…
- Low rank factorization: [10,36]
- Factorized embedding parameterization: [19]
- Block-Term tensor decomposition: [23,38]
- Singular Value Decomposition: [37]
- Joint factorization of recurrent and inter-layer weight matrices: [28]
- Tensor train decomposition: [10,17]
- Sparse factorization: [6]
• [11]
• Applications: In this section, we will discuss the application and success of various model compression methods across popular NLP tasks such as language modeling, machine translation, summarization, sentiment analysis, question answering, natural language inference, paraphrasing, image captioning, and handwritten character recognition.
• Summary and future trends.…”
Section: Tutorial Outline (citation type: mentioning; confidence: 99%)
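Several of the factorizations listed in this outline replace one large weight matrix with a product of smaller factors. As a minimal sketch of the Singular Value Decomposition entry (not taken from any of the cited works), the snippet below builds a rank-r approximation of a weight matrix; the matrix size and rank are illustrative.

```python
# Illustrative truncated SVD: approximate W (H x V) by A (H x r) times B (r x V).
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 10_000))            # e.g. hidden-to-vocab projection
r = 32                                        # target rank

U, S, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :r] * S[:r]                          # 256 x r factor, scaled by singular values
B = Vt[:r, :]                                 # r x 10000 factor

rel_error = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(W.size, A.size + B.size, round(rel_error, 3))   # parameter counts and error
```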