2021
DOI: 10.1016/j.neucom.2020.11.025

Mutual-learning sequence-level knowledge distillation for automatic speech recognition

Cited by 20 publications (4 citation statements)
References 46 publications
“…Knowledge distillation has proven effective in various domains, such as natural language processing [45, 46, 47, 48], computer vision [41, 49, 50, 51], and speech recognition [52, 53, 54, 55]. Its versatility stems from its capacity to distill the rich knowledge captured by a complex model into a more compact representation, which is suitable for deployment in environments with limited resources.…”
Section: Discussion (mentioning)
confidence: 99%
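As a rough illustration of the distillation idea this citation refers to, the sketch below shows a conventional frame-level knowledge distillation loss (soft-target KL against the teacher plus hard-target cross-entropy), assuming a PyTorch-style teacher/student setup. All names and hyperparameters here are illustrative assumptions; this is not the sequence-level mutual-learning objective of the cited paper itself.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-target KL term (teacher -> student) with hard-target CE.

    Illustrative sketch only: shapes assumed to be (batch, num_classes) for
    logits and (batch,) for integer targets.
    """
    # Soften both distributions with the temperature.
    soft_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    log_soft_student = F.log_softmax(student_logits / temperature, dim=-1)

    # Scale the KL term by T^2 so its gradient magnitude stays comparable
    # to the cross-entropy term (standard Hinton-style KD convention).
    kd = F.kl_div(log_soft_student, soft_teacher,
                  reduction="batchmean") * temperature ** 2

    # Standard cross-entropy against the ground-truth labels.
    ce = F.cross_entropy(student_logits, targets)

    return alpha * kd + (1.0 - alpha) * ce
```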
“…Recently, end-to-end models have been considered in deep learning for ASR. However, these algorithms are still not suitable for real-time ASR applications because of their large model sizes and computational complexity [65]. By contrast, SNN models and algorithms offer promising tools for temporal tasks such as ASR because they can directly handle temporal features in addition to spatial features.…”
Section: Related Work (mentioning)
confidence: 99%
“…But this method requires a large training corpus, which is time-consuming and not very accurate. Finally, the deep-learning-based human-machine interaction method is utilized, which automatically extracts speech and text features through deep learning models, effectively improving the performance of language recognition [4]. As user demands grow, human-machine interaction environments have become harsher, so simple deep learning models can no longer meet the high requirements of human-machine interaction tasks.…”
Section: Introduction (mentioning)
confidence: 99%