Interspeech 2019
DOI: 10.21437/interspeech.2019-3246

Exploiting Semi-Supervised Training Through a Dropout Regularization in End-to-End Speech Recognition

Abstract: In this paper, we explore various approaches for semi-supervised learning in an end-to-end automatic speech recognition (ASR) framework. The first step in our approach involves training a seed model on the limited amount of labelled data. Additional unlabelled speech data is employed through a data-selection mechanism to obtain the best hypothesized output, which is then used to retrain the seed model. However, the uncertainties of the model may not be well captured with a single hypothesis. As opposed to this technique,…
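As a rough illustration of the pipeline the abstract describes (seed model, data selection over hypothesized outputs, retraining), here is a minimal self-training sketch. The AsrModel interface, method names, and thresholds are hypothetical placeholders, not the paper's actual implementation.

```python
# Hypothetical sketch of the seed-model self-training loop outlined in the abstract.
from typing import List, Protocol, Tuple


class AsrModel(Protocol):
    def fit(self, data: List[Tuple[str, str]]) -> None: ...          # (audio_path, transcript) pairs
    def transcribe(self, audio_path: str) -> Tuple[str, float]: ...  # returns (hypothesis, confidence)


def self_train(model: AsrModel,
               labeled: List[Tuple[str, str]],
               unlabeled: List[str],
               threshold: float = 0.9,
               rounds: int = 2) -> AsrModel:
    model.fit(labeled)                           # 1) train seed model on the limited labelled set
    for _ in range(rounds):
        pseudo = []
        for path in unlabeled:
            hyp, conf = model.transcribe(path)
            if conf >= threshold:                # 2) data selection: keep confident hypotheses
                pseudo.append((path, hyp))
        model.fit(labeled + pseudo)              # 3) retrain on labelled + pseudo-labelled data
    return model
```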

Cited by 11 publications (8 citation statements) · References 14 publications
“…Experiments on Fisher English show that the proposed approach can further improve the WER over the regular semi-supervised training framework. While this chapter primarily focused on LF-MMI training, it is clear that the idea can be further extended to other frameworks, such as end-to-end semi-supervised training [Dey et al., 2019]. Had time allowed, it would have made sense to investigate the proposed semi-supervised training approach in cross-lingual adaptation scenarios.…”
Section: Results (mentioning)
confidence: 99%
“…Another way of using unlabeled speech is to generate pseudo-labels for it through a seed ASR model. Sample selection and filtering are then applied to retain the informative ones [15, 16].…”
Section: Related Work (mentioning)
confidence: 99%
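One concrete reading of the "sample selecting and filtering" step is sketched below; the confidence floor and length bounds are illustrative assumptions, not the actual criteria of [15] or [16].

```python
# Illustrative pseudo-label filter; thresholds are assumptions, not values from [15, 16].
from typing import List, NamedTuple


class PseudoSample(NamedTuple):
    audio_path: str
    hypothesis: str
    confidence: float   # decoder confidence for the 1-best hypothesis


def filter_pseudo_labels(samples: List[PseudoSample],
                         min_conf: float = 0.85,
                         min_words: int = 3,
                         max_words: int = 40) -> List[PseudoSample]:
    kept = []
    for s in samples:
        n_words = len(s.hypothesis.split())
        # Drop low-confidence and degenerate (too short / too long) hypotheses.
        if s.confidence >= min_conf and min_words <= n_words <= max_words:
            kept.append(s)
    return kept
```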
“…In [18], an N-best list of ASR hypotheses is used by summing the weighted losses of multiple pseudo-labels for a single speech utterance, where the weights are estimated from the scores of a strong language model (LM). In [19], multiple pseudo-labels are generated for each unlabeled speech utterance using different dropout settings and are used for self-training, with the purpose of capturing ASR uncertainties.…”
Section: Introduction (mentioning)
confidence: 99%
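A brief sketch of the two strategies just described: LM-score-weighted N-best losses (the idea attributed to [18]) and dropout-sampled pseudo-labels (the idea attributed to [19]). The softmax weighting and all helper names are assumptions for illustration, not the cited papers' exact formulations.

```python
# Sketch of the two multi-pseudo-label strategies above; names and the softmax
# weighting over LM scores are illustrative assumptions.
import math
from typing import Callable, List


def lm_weights(lm_scores: List[float]) -> List[float]:
    """Softmax over (log-domain) LM scores -> one weight per pseudo-label."""
    m = max(lm_scores)
    exps = [math.exp(s - m) for s in lm_scores]
    z = sum(exps)
    return [e / z for e in exps]


def weighted_nbest_loss(losses: List[float], lm_scores: List[float]) -> float:
    """Sum of per-hypothesis losses weighted by LM-derived weights (cf. [18])."""
    return sum(w * l for w, l in zip(lm_weights(lm_scores), losses))


def dropout_pseudo_labels(decode: Callable[[str, int], str],
                          audio_path: str,
                          n_samples: int = 4) -> List[str]:
    """Generate multiple pseudo-labels via differently seeded dropout (cf. [19])."""
    # `decode(audio_path, seed)` is assumed to run the ASR model with dropout
    # kept active at inference time, seeded differently on each call.
    return [decode(audio_path, seed) for seed in range(n_samples)]
```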