2016
DOI: 10.1186/s13636-016-0088-7
Wise teachers train better DNN acoustic models

Abstract: Automatic speech recognition is becoming more ubiquitous as recognition performance improves, capable devices increase in number, and new areas of application open up. Neural network acoustic models that can utilize speaker-adaptive features, have deep and wide layers, or use more computationally expensive architectures often obtain the best recognition accuracy, but they may not be suitable for the computational, storage, or latency budget of the deployed system. We explore a str…

Cited by 18 publications (18 citation statements)
References 33 publications (51 reference statements)
“…Earlier studies investigate the student-teacher framework to enable model compression and to encapsulate the information of multiple models into a single network [12,15]. This approach was also found beneficial for semi-supervised training exploiting untranscribed training data [14], although the investigation is limited to in-domain data matching the initial transcribed speech used for supervised training. Semi-supervised training [16,17,18,19] has been popular for low-resource tasks where cheap-to-obtain untranscribed data is readily available.…”
Section: Introduction (mentioning)
confidence: 99%
“…We exploit the framework of student-teacher DNN training, which has been recognized as promising for knowledge transfer and distillation [11,12,13,14]. The basic idea of student-teacher DNN training is that a teacher DNN (often trained with hard targets) provides soft targets for training a student DNN.…”
Section: Introduction (mentioning)
confidence: 99%
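As an illustration of the student-teacher idea quoted above, the following is a minimal sketch of soft-target (knowledge distillation) training in PyTorch. The feature dimension, number of senone classes, network layout, temperature, and interpolation weight between hard and soft targets are assumptions for the example only and are not taken from the paper or the citing works.

```python
# Minimal sketch of student-teacher (soft-target) DNN training.
# All sizes and hyperparameters below are assumed for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_SENONES = 3000   # assumed number of senone classes (student output size)
FEAT_DIM = 40        # assumed acoustic feature dimension per frame
T = 2.0              # softmax temperature (assumed)
ALPHA = 0.5          # weight between hard-label and soft-target losses (assumed)

# A small feed-forward student network standing in for the real acoustic model.
student = nn.Sequential(
    nn.Linear(FEAT_DIM, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, NUM_SENONES),
)

def distillation_loss(student_logits, teacher_logits, hard_labels):
    # Soft-target term: KL divergence between temperature-softened teacher
    # and student posteriors (scaled by T^2, as is conventional).
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-target term: ordinary cross-entropy against forced-alignment labels.
    hard = F.cross_entropy(student_logits, hard_labels)
    return ALPHA * hard + (1.0 - ALPHA) * soft

# Toy usage with random tensors standing in for acoustic frames and a frozen teacher.
feats = torch.randn(8, FEAT_DIM)
teacher_logits = torch.randn(8, NUM_SENONES)   # would come from the teacher DNN
labels = torch.randint(0, NUM_SENONES, (8,))
loss = distillation_loss(student(feats), teacher_logits, labels)
loss.backward()
```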
“…We have used the trainlm training function because it is the fastest backpropagation algorithm. It is based on the Levenberg-Marquardt optimization algorithm [31]-[34].…”
Section: Experiments and Implementation (mentioning)
confidence: 99%
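For context on the Levenberg-Marquardt algorithm mentioned in that statement, a standard form of its weight update (written here from general knowledge rather than from the cited references) is:

```latex
% Levenberg-Marquardt weight update: J is the Jacobian of the network errors e
% with respect to the weights w, and \mu is the damping factor that blends
% Gauss-Newton behaviour (small \mu) with gradient descent (large \mu).
\Delta w = -\left(J^{\top} J + \mu I\right)^{-1} J^{\top} e
```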
“…Soft targets have been previously used for DNN knowledge distillation by model compression [4,5] and knowledge transfer [6,7]. In the current work, they refer to the senone posterior probabilities generated by an already trained DNN model on close-talk speech data.…”
Section: Introduction (mentioning)
confidence: 99%