2020
DOI: 10.1109/access.2020.3023783

End-to-End Amdo-Tibetan Speech Recognition Based on Knowledge Transfer

Abstract: End-to-end speech recognition addresses a limitation of the traditional speech recognition model, whose components are independent and cannot be jointly optimized. It merges the acoustic model, language model, and decoding unit of the hybrid model into a single neural network, which avoids the inherent defects of a multi-module pipeline and greatly reduces the complexity of the speech recognition model. In this research, an Amdo-Tibetan speech recognition system is constr…

Cited by 8 publications
(2 citation statements)
References 39 publications
“…To that end, the character error rate (CER) has been used instead of WER, although the evaluation principle remains the same. Besides, WER is also called phoneme error rate (PER) in schemes that adopt phoneme as a unit of measure rather than a word [53][54][55]. The word recognition rate (WRR) is a version of WER that may be used to assess ASR performance such that WRR = 1 − WER and N − (S + D) is the total number of successfully predicted words [25].…”
Section: Evaluation Criteria in ASR (mentioning)
confidence: 99%
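
The metrics named in the citation statement above can be illustrated with a short sketch. It assumes the standard definition WER = (S + D + I) / N, where S, D, and I count substitutions, deletions, and insertions in a minimum-edit alignment and N is the number of reference units; the quoted passage names S, D, and N but does not spell this formula out. The function names `edit_counts`, `error_rate`, and `recognition_rate` are hypothetical and do not come from the cited papers.

```python
from typing import Sequence, Tuple


def edit_counts(ref: Sequence[str], hyp: Sequence[str]) -> Tuple[int, int, int]:
    """Return (S, D, I) for one minimum-edit alignment of hyp against ref."""
    n, m = len(ref), len(hyp)
    # cell[i][j] = (total edits, S, D, I) for ref[:i] versus hyp[:j]
    cell = [[(0, 0, 0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cell[i][0] = (i, 0, i, 0)      # only deletions remain
    for j in range(1, m + 1):
        cell[0][j] = (j, 0, 0, j)      # only insertions remain
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, k = cell[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (t, s, d, k)              # match, no cost
            else:
                diag = (t + 1, s + 1, d, k)      # substitution
            t, s, d, k = cell[i - 1][j]
            up = (t + 1, s, d + 1, k)            # deletion
            t, s, d, k = cell[i][j - 1]
            left = (t + 1, s, d, k + 1)          # insertion
            # Tuples compare on total edits first, so the minimum is kept.
            cell[i][j] = min(diag, up, left)
    _, s, d, k = cell[n][m]
    return s, d, k


def error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """(S + D + I) / N. The unit (word, character, phoneme) is set by the caller."""
    if len(ref) == 0:
        raise ValueError("reference must contain at least one unit")
    s, d, k = edit_counts(ref, hyp)
    return (s + d + k) / len(ref)


def recognition_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """WRR = 1 - WER, as stated in the quoted passage."""
    return 1.0 - error_rate(ref, hyp)


# Word-level (WER) and character-level (CER) use the same routine.
wer = error_rate("the cat sat on the mat".split(), "the cat sit on mat".split())
cer = error_rate(list("abc"), list("abd"))
```

In the word-level example, the alignment has one substitution (sat/sit) and one deletion (the second "the"), so N − (S + D) = 6 − 2 = 4 words are predicted correctly. Passing character or phoneme sequences instead of word lists gives CER or PER with no other change, which is the sense in which the evaluation principle stays the same.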
“…In practical application scenarios, speech signals will inevitably be disturbed by many interference factors such as noise, echo, and reverberation. Therefore, speech enhancement technology has been widely used in household appliances, communications, speech recognition, automotive electronics, hearing aids, and other fields [1][2][3]. Traditional speech enhancement methods, based on signal processing and statistical modeling, have good performance for stationary noise.…”
Section: Introduction (mentioning)
confidence: 99%