2019
DOI: 10.3390/sym11020179

Cross-Language End-to-End Speech Recognition Research Based on Transfer Learning for the Low-Resource Tujia Language

Abstract: To rescue and preserve an endangered language, this paper studied an end-to-end speech recognition model based on sample transfer learning for the low-resource Tujia language. From the perspective of the Tujia language international phonetic alphabet (IPA) label layer, using Chinese corpus as an extension of the Tujia language can effectively solve the problem of an insufficient corpus in the Tujia language, constructing a cross-language corpus and an IPA dictionary that is unified between the Chinese and Tuji…

Cited by 21 publications (15 citation statements)
References 36 publications
“…Recent developments in the automatic speech recognition field, mainly fueled by the advancements made by deep learning methods, have shown impressive results even for under-resourced languages [9,10]. Based on these results we believe that the state-of-the-art deep learning methods can be used to create a high accuracy ASR system for the Lithuanian language.…”
Section: Introduction (mentioning)
confidence: 77%
“…To deal with this language we chose a program called PyCharm, a development program that supports Python language. We relied on GPU for training which has a great ability to handle matrixes and repetitive operations (Yu et al, 2019).…”
Section: Results (mentioning)
confidence: 99%
“…Wang et al [44] combined CTC with Tibetan linguistics knowledge and used bound triphones as a modeling unit to solve the problem of Tibetan acoustic modeling under resource constraints, which made the recognition rate based on the end-to-end acoustic model method exceed the speech recognition system based on BLSTM-HMM. Yu et al [21], [32] used the BLSTM-CTC joint model to achieve phonemic level speech recognition for a few hours of data.…”
Section: B Acoustic Models With End-to-end Structure (mentioning)
confidence: 99%
“…Finally, the transferred model is retrained using target low-resource language. In this case, the pre-training model can be regarded as a feature extractor [32]. In practice, the pre-training model can be used in combination with multilingual training.…”
Section: ) Transfer Learning (mentioning)
confidence: 99%