ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2019.8683422

Improving Noise Robustness of Automatic Speech Recognition via Parallel Data and Teacher-student Learning

Abstract: For real-world speech recognition applications, noise robustness is still a challenge. In this work, we adopt the teacher-student (T/S) learning technique using a parallel clean and noisy corpus for improving automatic speech recognition (ASR) performance under multimedia noise. On top of that, we apply a logits selection method which only preserves the k highest values to prevent wrong emphasis of knowledge from the teacher and to reduce bandwidth needed for transferring data. We incorporate up to 8000 hours o…
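The abstract's top-k logits selection can be sketched as follows. This is a minimal illustration assuming PyTorch; the value k = 20 and the 9000-senone output size are placeholders, since the excerpt does not state them. Classes outside the teacher's k highest logits receive probability zero, which avoids emphasizing the teacher's low-confidence classes and shrinks the soft targets that must be transferred.

```python
import torch
import torch.nn.functional as F

def topk_soft_targets(teacher_logits: torch.Tensor, k: int = 20) -> torch.Tensor:
    """Keep only each frame's k highest teacher logits and renormalize.

    Classes outside the top k get probability exactly 0, so the teacher
    cannot wrongly emphasize low-confidence senones, and only k
    (value, index) pairs per frame need to be transferred.
    """
    topk_vals, topk_idx = teacher_logits.topk(k, dim=-1)
    masked = torch.full_like(teacher_logits, float("-inf"))
    masked.scatter_(-1, topk_idx, topk_vals)
    return F.softmax(masked, dim=-1)  # exp(-inf) = 0 for masked classes

# Example: 2 frames over a (hypothetical) 9000-senone output layer.
soft = topk_soft_targets(torch.randn(2, 9000), k=20)
assert torch.allclose(soft.sum(-1), torch.ones(2))
```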

Cited by 40 publications (40 citation statements). References 26 publications.

Citation statements, ordered by relevance:
“…The gains we observe using T/S training are in the same range as the results reported in [15], and our experiments demonstrate that we can distill information from a single-channel teacher model to a multi-channel student model and learn the front-end components in a data-driven manner. The second observation we make is that the WER improvements from pre-training persist even after T/S training.…”
Section: Results with T/S Training (supporting)
Confidence: 78%
“…We also soften the senone logits output by the teacher using temperature T. For all our experiments, we use T = 2 since that was found to be the optimal parameter in [15].…”
Section: Teacher-Student Training (mentioning)
Confidence: 99%
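For readers unfamiliar with logit softening, here is the standard softmax-with-temperature formulation the statement above refers to; the example values below are illustrative, not the citing authors':

```python
import torch
import torch.nn.functional as F

def soften(logits: torch.Tensor, T: float = 2.0) -> torch.Tensor:
    """Softmax with temperature T; larger T flattens the distribution,
    exposing more of the teacher's relative preferences among classes."""
    return F.softmax(logits / T, dim=-1)

logits = torch.tensor([4.0, 2.0, 0.5])
print(soften(logits, T=1.0))  # sharp:  ~[0.86, 0.12, 0.03]
print(soften(logits, T=2.0))  # softer: ~[0.65, 0.24, 0.11]
```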
“…Teacher-student (T/S) learning [1,2] has been widely applied to a variety of deep learning tasks in speech, language and image processing, including model compression [1,2], domain adaptation [3,4,5], small-footprint neural machine translation (NMT) [6], low-resource NMT [7], far-field automatic speech recognition (ASR) [8,9], low-resource language ASR [10] and neural network pre-training [11]. T/S learning falls in the category of transfer learning, where the network of interest, as a student, is trained by mimicking the behavior of a well-trained network, as a teacher, in the presence of the same or stereo training samples.…”
Section: Introduction (mentioning)
Confidence: 99%
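As a sketch of the mechanism this passage describes, one T/S update on a parallel ("stereo") clean/noisy pair might look as follows. The model objects, optimizer, and KL-based loss are assumptions in the spirit of standard T/S recipes, not the exact procedure of any cited paper:

```python
import torch
import torch.nn.functional as F

def ts_step(teacher, student, clean, noisy, optimizer, T=2.0):
    """One T/S update on a parallel clean/noisy pair: the student sees
    the noisy features but is pulled toward the teacher's softened
    posteriors computed on the time-aligned clean features."""
    with torch.no_grad():
        soft_targets = F.softmax(teacher(clean) / T, dim=-1)
    student_logp = F.log_softmax(student(noisy) / T, dim=-1)
    # KL(teacher || student) equals cross-entropy up to a constant; a T**2
    # loss scaling is common when mixing with a hard-label loss (omitted).
    loss = F.kl_div(student_logp, soft_targets, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```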
“…The AM is based on the standard HMM/deep learning hybrid, and we summarize details relevant to this paper in Section II-B. Other aspects of this system have been described elsewhere ([25], [26], [27], [28]). The LM [29] estimates the a priori probability that the speaker will utter a sequence of words.…”
Section: Introduction (mentioning)
Confidence: 99%
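The AM/LM division quoted above follows the standard noisy-channel decomposition of ASR decoding, which can be restated (in my notation, not the citing authors') with W the word sequence and X the acoustic observations:

```latex
\hat{W} \;=\; \arg\max_{W} P(W \mid X)
        \;=\; \arg\max_{W}\; \underbrace{p(X \mid W)}_{\text{acoustic model}}
                          \,\underbrace{P(W)}_{\text{language model (a priori)}}
```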