2022
DOI: 10.3390/electronics11071066
A Novel Deep Learning Model Compression Algorithm

Abstract: To address the heavy computing-power consumption of large models, this paper proposes a novel model compression algorithm. Firstly, this paper proposes an interpretable weight allocation method for the losses between a student network (a network model with poorer performance), a teacher network (a network model with better performance), and the real labels. Then, different from previous simple pruning and fine-tuning, this paper performs knowledge distillation on the pruned model and quantizes the residual…
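The abstract weights the losses among the student output, the teacher output, and the real labels. The following is a minimal sketch of a conventional weighted distillation loss under that setup; the fixed weight `alpha` and temperature `T` are illustrative assumptions and do not reproduce the paper's interpretable weight allocation method.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, alpha=0.5, T=4.0):
    """Weighted sum of a teacher-student term and a ground-truth term (sketch)."""
    # Soft-target term: the student matches the teacher's softened distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-target term: the student matches the real labels.
    hard = F.cross_entropy(student_logits, labels)
    # alpha allocates weight between the two terms; the paper derives this
    # allocation in an interpretable way, which is not modeled here.
    return alpha * soft + (1.0 - alpha) * hard
```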

Cited by 8 publications (7 citation statements)
References: 34 publications
“…Given the audio signal y(t), the STFT representation at time t and frequency f, X(t, f), can be obtained by Equation (1), where w(t) is the window function and τ is the integration variable. The instantaneous phase can be extracted from the STFT as in Equation (2). Then, the stretched phase is calculated as in Equation (3), where a is the time stretch factor.…”
Section: Function Main() (mentioning)
confidence: 99%
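The excerpt refers to Equations (1)-(3) of the citing paper, which are not reproduced on this page. As a rough sketch of the quantities it names, the snippet below computes a windowed STFT, its instantaneous phase, and a phase sequence scaled by a time stretch factor `a` in a generic phase-vocoder style; librosa is assumed for the STFT, and the exact stretched-phase rule of Equation (3) may differ.

```python
import numpy as np
import librosa  # assumed here for the windowed STFT

def stretched_phase(y, a=1.5, n_fft=2048, hop_length=512):
    # STFT X(t, f) of the audio signal y(t) with a Hann window w(t), cf. Equation (1).
    X = librosa.stft(y, n_fft=n_fft, hop_length=hop_length, window="hann")
    # Instantaneous phase extracted from the STFT, cf. Equation (2).
    phase = np.angle(X)
    # Illustrative stretched phase: frame-to-frame phase increments are
    # accumulated after scaling by the time stretch factor a.
    dphase = np.diff(phase, axis=1, prepend=phase[:, :1])
    return np.cumsum(a * dphase, axis=1)
```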
“…Although DL models are complex and resource-intensive, these models can be leveraged to fit into resource-constrained environments by techniques such as model pruning, quantization, and knowledge distillation [2,3]. These techniques, aimed at reducing both the size and computational complexity of DL models, come with the challenge of achieving an optimal trade-off between complexity reduction and sustained performance.…”
Section: Introduction (mentioning)
confidence: 99%
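Of the three techniques named in this excerpt, pruning is perhaps the quickest to illustrate. Below is a minimal sketch of unstructured magnitude pruning using PyTorch's `torch.nn.utils.prune` utilities; the toy model and the 30% sparsity level are arbitrary choices for illustration, not taken from the cited works.

```python
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy model standing in for a resource-hungry DL model.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

for module in model.modules():
    if isinstance(module, nn.Linear):
        # Zero out the 30% smallest-magnitude weights in each linear layer.
        prune.l1_unstructured(module, name="weight", amount=0.3)
        # Make the pruning permanent by removing the re-parametrization.
        prune.remove(module, "weight")

zeroed = sum(int((m.weight == 0).sum()) for m in model.modules()
             if isinstance(m, nn.Linear))
print(f"weights zeroed by pruning: {zeroed}")
```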
“…With limited resources and funding, the optimization of CNNs (Habib & Qureshi, 2022) is crucial to their scalability in OoCs. To address the issue of scalability, customized model design, compression of model designs (Zhao et al., 2022), and iterative improvements to models over time as deep learning progresses are all actions that could be taken to increase computational efficiency.…”
Section: Scalability (mentioning)
confidence: 99%
“…However, the enormous number of computations and parameters of CNNs hinders further development. Thus, it is not practical to deploy heavy CNNs on resource-constrained computing devices, such as embedded systems and mobile devices [14-16]. To address these problems, substantial research efforts have been devoted to compression techniques: channel pruning [17-20], low-rank decomposition [21-23], and weight quantization [24,25].…”
Section: Introduction (mentioning)
confidence: 99%
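As a concrete illustration of the low-rank decomposition family cited in this excerpt, the sketch below approximates a dense weight matrix by a truncated SVD; the matrix shape and the rank `r` are arbitrary, and the cited works [21-23] use more elaborate schemes than this plain factorization.

```python
import numpy as np

def low_rank_factorize(W, r):
    """Approximate W (m x n) by A (m x r) @ B (r x n) via truncated SVD."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :r] * S[:r]  # absorb the singular values into the left factor
    B = Vt[:r, :]
    return A, B

W = np.random.randn(256, 512)
A, B = low_rank_factorize(W, r=32)
# Storage drops from 256*512 parameters to 256*32 + 32*512.
err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
print(f"relative approximation error: {err:.3f}")
```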