2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2014
DOI: 10.1109/icassp.2014.6854208

The RWTH English lecture recognition system

Abstract: In this paper, we describe the RWTH speech recognition system for English lectures developed within the Translectures project. A difficulty in the development of an English lecture recognition system is the high ratio of non-native speakers. We address this problem by using very effective deep bottleneck features trained on multilingual data. The acoustic model is trained on large amounts of data from different domains and with different dialects. Large improvements are obtained from unsupervised acoustic ad…

Cited by 3 publications (2 citation statements)
References 15 publications
“…They are trigram LMs of vocabulary size 64k words. The selection of these LMs has been carried out on the basis of literature, as these models are widely used for large vocabulary speech recognition [30][31][32][33][34]. A description of LMs is given here.…”
Section: Comparison To Existing LMs
Citation type: mentioning
confidence: 99%
“…Nowadays, automatic transcriptions of spontaneous speech in moderately noisy environments have reached an accurate enough quality ([1,2,3]). This quality can be even better when ASR systems are adapted to specific scenarios ([4,5,6,7,8,9]). Nonetheless, ASR is still far from producing error-free transcriptions and, consequently, its performance in many applications is not completely satisfactory.…”
Section: Introduction
Citation type: mentioning
confidence: 99%