2014
DOI: 10.1587/transinf.e97.d.285

Cross-Lingual Phone Mapping for Large Vocabulary Speech Recognition of Under-Resourced Languages

Abstract: This paper presents a novel acoustic modeling technique of large vocabulary automatic speech recognition for under-resourced languages by leveraging well-trained acoustic models of other languages (called source languages). The idea is to use source language acoustic model to score the acoustic features of the target language, and then map these scores to the posteriors of the target phones using a classifier. The target phone posteriors are then used for decoding in the usual way of hybrid acoustic mod…
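The mapping step described in the abstract can be illustrated with a minimal sketch. The abstract says only that "a classifier" maps source-model scores to target phone posteriors; the linear softmax classifier, the function names, and the synthetic data below are all assumptions chosen for illustration and are not taken from the paper.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def map_to_target_posteriors(src_scores, weights, bias):
    """Map one frame of source-model scores to target phone posteriors."""
    logits = [
        bias[t] + sum(s * weights[i][t] for i, s in enumerate(src_scores))
        for t in range(len(bias))
    ]
    return softmax(logits)


def train_phone_mapping(frames, labels, n_tgt, lr=0.5, epochs=100):
    """Fit the (assumed) linear softmax mapping by batch gradient descent."""
    n_src = len(frames[0])
    weights = [[0.0] * n_tgt for _ in range(n_src)]
    bias = [0.0] * n_tgt
    for _ in range(epochs):
        grad_w = [[0.0] * n_tgt for _ in range(n_src)]
        grad_b = [0.0] * n_tgt
        for x, y in zip(frames, labels):
            post = map_to_target_posteriors(x, weights, bias)
            for t in range(n_tgt):
                # Cross-entropy gradient with respect to the logit of phone t.
                err = post[t] - (1.0 if t == y else 0.0)
                grad_b[t] += err
                for i in range(n_src):
                    grad_w[i][t] += err * x[i]
        scale = lr / len(frames)
        for t in range(n_tgt):
            bias[t] -= scale * grad_b[t]
            for i in range(n_src):
                weights[i][t] -= scale * grad_w[i][t]
    return weights, bias


def make_toy_data(n_frames=300, seed=0):
    """Synthetic stand-in for source-model scores (hypothetical data).

    Each of 3 target phones excites a distinct pair of 6 source phones.
    """
    rng = random.Random(seed)
    n_src, n_tgt = 6, 3
    frames, labels = [], []
    for _ in range(n_frames):
        y = rng.randrange(n_tgt)
        logits = [rng.gauss(0.0, 0.5) for _ in range(n_src)]
        logits[2 * y] += 3.0
        logits[2 * y + 1] += 3.0
        frames.append(softmax(logits))  # source phone posteriors for the frame
        labels.append(y)
    return frames, labels, n_tgt


frames, labels, n_tgt = make_toy_data()
weights, bias = train_phone_mapping(frames, labels, n_tgt)
predicted = [
    max(range(n_tgt), key=lambda t: p[t])
    for p in (map_to_target_posteriors(x, weights, bias) for x in frames)
]
accuracy = sum(int(p == y) for p, y in zip(predicted, labels)) / len(labels)
print(f"frame accuracy on toy data: {accuracy:.2f}")
```

In a real system the source scores would come from a trained source-language acoustic model, and the resulting target posteriors would be passed to a decoder; neither of those stages is modeled here.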

Cited by 12 publications (7 citation statements)
References 19 publications
“…Multilingual transfer in ASR often relies on using bottle-neck features (Vesely et al, 2012; Vu et al, 2012; Karafiát et al, 2018) and adapting an acoustic model trained on one language to effectively recognize the sounds of other languages (Schultz and Waibel, 2001; Le and Besacier, 2005; Stolcke et al, 2006; Tóth et al, 2008; Plahl et al, 2011; Thomas et al, 2012; Imseng et al, 2014; Do et al, 2014; Heigold et al, 2013; Scharenborg et al, 2017). However, while most work uses less than 10 languages for model training, we include up to 100 languages in training.…”
Section: Related Work (mentioning)
confidence: 99%
“…This makes a full-fledged acoustic modeling process impractical for under-resourced languages. Popular approaches are to transfer well-trained acoustic models to under-resourced languages such as universal phone set [2,3], tandem approach [4][5][6], subspace GMMs (SGMMs) [7,8], Kullback-Leibler divergence HMM (KL-HMM) [9,10], cross-lingual phone mapping [11][12][13] and its extension, context-dependent phone mapping [14][15][16][19].…”
Section: Introduction (mentioning)
confidence: 99%
“…Transactions on Information and Systems [26], and in the two conferences: IALP 2012 [27] and ISCSLP 2012 [28]. The work on applying deep neural networks on monolingual speech recognition and cross-lingual phone mapping is published in the two conferences:…”
Section: Contributions (mentioning)
confidence: 99%
“…The results in this chapter have been published in: IEICE Transactions on Information and Systems [26], IALP 2012 [27], and ISCSLP 2012 [28].…”
Section: Cross-lingual Phone Mapping (mentioning)
confidence: 99%