2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2018.8461802

Sequence-Based Multi-Lingual Low Resource Speech Recognition

Abstract: Techniques for multi-lingual and cross-lingual speech recognition can help in low-resource scenarios to bootstrap systems and enable analysis of new languages and domains. End-to-end approaches, in particular sequence-based techniques, are attractive because of their simplicity and elegance. While it is possible to integrate traditional multi-lingual bottleneck feature extractors as front-ends, we show that end-to-end multi-lingual training of sequence models is effective on context-independent models trained…

Cited by 87 publications (87 citation statements) · References 22 publications
“…Closer to our work, several works have shared the parameters of a neural network encoder, using feedforward networks [3,1,2] or LSTMs [11]. The model is then fine-tuned on the target low-resource language to fit its specificities [12]. The sampling of the languages during pre-training can focus on languages related to the target language [11].…”
Section: Multilingual Pre-training for Speech Recognition
confidence: 99%
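The pre-train-then-fine-tune recipe described in the statement above can be sketched in a few lines. This is a schematic illustration, not the cited papers' implementation: the "encoder" is a single linear layer trained with a toy squared-error gradient step, and all names, sizes, and hyperparameters are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
FEAT_DIM, HIDDEN_DIM = 40, 16

def sgd_step(W, X, Y, lr=0.01):
    """One toy gradient step on a linear layer with squared-error loss."""
    grad = X.T @ (X @ W - Y) / len(X)
    return W - lr * grad

# 1) Pre-train the shared encoder on pooled data from many source languages.
W = rng.normal(scale=0.1, size=(FEAT_DIM, HIDDEN_DIM))
X_multi = rng.normal(size=(1000, FEAT_DIM))
Y_multi = rng.normal(size=(1000, HIDDEN_DIM))
for _ in range(50):
    W = sgd_step(W, X_multi, Y_multi)

# 2) Fine-tune: copy the pre-trained weights and continue training on the
#    much smaller target low-resource corpus, so the model keeps what it
#    learned multilingually while adapting to the target language.
W_finetuned = W.copy()
X_target = rng.normal(size=(50, FEAT_DIM))
Y_target = rng.normal(size=(50, HIDDEN_DIM))
for _ in range(20):
    W_finetuned = sgd_step(W_finetuned, X_target, Y_target)
```

The key point is only the weight copy between the two phases: fine-tuning starts from the multilingual optimum rather than from a random initialization.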
“…Multilingual speech recognition has explored various models to share parameters across languages in different ways. For example, parameters can be shared by using posterior features from other languages [6], applying the same GMM components across different HMM states [7], training shared hidden layers in DNNs [3,4] or LSTM [5], using language independent bottleneck features [8,9]. Some models only share their hidden layers, but use separate output layers to predict their phones [3,4].…”
Section: Related Work
confidence: 99%
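One of the sharing schemes mentioned above, shared hidden layers with separate per-language output layers, can be sketched as follows. This is a minimal illustration under assumed names and sizes (a single tanh hidden layer, two invented languages), not the architecture of any cited system.

```python
import numpy as np

rng = np.random.default_rng(0)

FEAT_DIM, HIDDEN_DIM = 40, 64
PHONE_SETS = {"lang_a": 30, "lang_b": 45}   # per-language phone inventories

# Shared parameters: one hidden layer used by every language.
W_shared = rng.normal(scale=0.1, size=(FEAT_DIM, HIDDEN_DIM))
b_shared = np.zeros(HIDDEN_DIM)

# Language-specific parameters: one softmax output layer per language.
heads = {
    lang: (rng.normal(scale=0.1, size=(HIDDEN_DIM, n)), np.zeros(n))
    for lang, n in PHONE_SETS.items()
}

def softmax(x):
    z = np.exp(x - x.max(axis=-1, keepdims=True))
    return z / z.sum(axis=-1, keepdims=True)

def forward(features, lang):
    """Shared hidden layer, then the output layer of the given language."""
    h = np.tanh(features @ W_shared + b_shared)
    W_out, b_out = heads[lang]
    return softmax(h @ W_out + b_out)

# A batch of 5 frames scored against each language's own phone set.
frames = rng.normal(size=(5, FEAT_DIM))
post_a = forward(frames, "lang_a")   # shape (5, 30)
post_b = forward(frames, "lang_b")   # shape (5, 45)
```

During training, gradients from every language update `W_shared`, while each output head only sees data from its own language, which is what lets the hidden representation become language-independent.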
“…Our purpose here is to compute the embedding e_i for each corpus C_i, where e_i is expected to encode information about its corpus C_i. These embeddings can be jointly trained with the standard multilingual model [5]. First, the embedding matrix E for all corpora is initialized, with the i-th row of E corresponding to the embedding e_i of corpus C_i.…”
Section: Corpus Embedding
confidence: 99%
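The corpus-embedding idea in the statement above amounts to looking up row e_i of a matrix E and feeding it to the model alongside the acoustic features. The sketch below assumes the embedding is concatenated to every frame; the dimensions, names, and the concatenation choice are illustrative assumptions, not details from the cited work.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_CORPORA, EMB_DIM, FEAT_DIM = 3, 8, 40

# Embedding matrix E: row i is the trainable embedding e_i of corpus C_i.
E = rng.normal(scale=0.1, size=(NUM_CORPORA, EMB_DIM))

def augment_with_corpus_embedding(features, corpus_id):
    """Concatenate e_i (tiled over time) onto each frame of the utterance."""
    T = features.shape[0]
    e_i = np.broadcast_to(E[corpus_id], (T, EMB_DIM))
    return np.concatenate([features, e_i], axis=1)

utt = rng.normal(size=(100, FEAT_DIM))          # 100 frames of features
x = augment_with_corpus_embedding(utt, corpus_id=1)
print(x.shape)   # (100, 48): 40 acoustic dims + 8 corpus-embedding dims
```

Because E is trained jointly with the multilingual model, each e_i can absorb corpus-specific variation (channel, speaking style, language) that the shared encoder would otherwise have to model implicitly.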
“…The model structure is designed to learn an encoder to extract language-independent representations to build a better acoustic model from many source languages. The success of "language-independent" features to improve ASR performance compared to monolingual training has been shown in many recent works [7,8,9]. Besides directly training the model with all the source languages, there are various variants of MultiASR approaches.…”
Section: Introduction
confidence: 99%