2006
DOI: 10.1109/tasl.2006.878264

Progress in the CU-HTK broadcast news transcription system

Abstract: Broadcast News (BN) transcription has been a challenging research area for many years. In the last couple of years the availability of large amounts of roughly transcribed acoustic training data and advanced model training techniques has offered the opportunity to greatly reduce the error rate on this task. This paper describes the design and performance of BN transcription systems which make use of these developments. First the effects of using lightly-supervised training data and advanced acoustic m…

Cited by 77 publications (66 citation statements)
References 30 publications

“…Interestingly the V5 SPron system when combined with the V3 MPron system also gave good gains. This shows that the SPron is complementary to the MPron system as observed for English systems in, for example, [7]. However on bcad06 using the standard G0+V3 system outperforms the V3+V5 system.…”
Section: Single Pronunciation Modelling
Citation type: mentioning
confidence: 52%
“…We build Model M on each source and interpolate them using the same weights as in the baseline, yielding a WER of 12.3%, or a gain of 0.7% absolute. As far as we know, this is the best single-system result for this data set, surpassing the previous best of 12.6% [10]. On the held-out set, the perplexity is reduced from 133 for the baseline to 121.…”
Section: B English Broadcast News Transcription
Citation type: mentioning
confidence: 63%
“…Each audio file was segmented and the segments were clustered for speaker adaptation using the segmenter and clusterer part of the Cambridge University RT-04 transcription system [11]. Each speech segment was decoded using a two-pass recognition framework [12,11] including speaker adaptation, with the decoding employing a biased language model (LM) and tandem-SAT acoustic models trained on a subset of the training dataset. The biased LM was initially trained on the subtitle transcripts and interpolated with the overall language model, with a 0.9/0.1 interpolation weight ratio, resulting in an interpolated LM biased to the transcripts.…”
Section: Metadata
Citation type: mentioning
confidence: 99%
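
Note on the statement above: the 0.9/0.1 language model interpolation it describes can be illustrated with a minimal sketch. This is not the citing authors' implementation. It assumes simple unigram models stored as dictionaries, assumes the 0.9 weight is applied to the transcript-trained model, and uses a hypothetical function name (interpolate_lms) and made-up toy probabilities.

```python
# Minimal sketch of linear language model interpolation.
# Assumptions (not from the source): unigram models held as dicts,
# the 0.9 weight goes to the transcript-biased model, and all
# probabilities below are toy values chosen for illustration.

def interpolate_lms(p_biased, p_general, weight_biased=0.9):
    """Return P(w) = weight_biased * P_biased(w) + (1 - weight_biased) * P_general(w)."""
    if not 0.0 <= weight_biased <= 1.0:
        raise ValueError("weight_biased must lie in [0, 1]")
    vocab = set(p_biased) | set(p_general)
    return {
        word: weight_biased * p_biased.get(word, 0.0)
        + (1.0 - weight_biased) * p_general.get(word, 0.0)
        for word in vocab
    }

# Toy models: one estimated from subtitle transcripts, one general.
transcript_lm = {"news": 0.5, "weather": 0.5}
general_lm = {"news": 0.2, "sport": 0.8}

mixed_lm = interpolate_lms(transcript_lm, general_lm, weight_biased=0.9)
```

Because both inputs are probability distributions and the weights sum to one, the interpolated model is also a distribution; words seen only in the general model keep a small but non-zero probability.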