2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
DOI: 10.1109/icassp.2016.7472684

On combining i-vectors and discriminative adaptation methods for unsupervised speaker normalization in DNN acoustic models

Cited by 20 publications (10 citation statements) · References 22 publications

“…There has been considerable recent work exploring the use of i-vectors [27] for this purpose. I-vectors, which can be regarded as basis vectors spanning a subspace of speaker variability, were first used for adaptation in a GMM framework by Karafiát et al. [28], and were later successfully employed for DNN adaptation [29]-[34]. Other examples of auxiliary features include the use of speaker-specific bottleneck features obtained from a speaker separation DNN [35], the use of out-of-domain tandem features [24], as well as speaker codes [36]-[38] in which a specific set of units for each speaker is optimised.…”
Section: DNN Acoustic Modelling and Adaptation (mentioning)
confidence: 99%
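
The statement above describes the common way i-vectors serve as auxiliary DNN inputs: a per-speaker (or per-utterance) i-vector is appended to every acoustic frame. The sketch below illustrates that idea only; the feature and i-vector dimensions, layer sizes, and output size are illustrative assumptions, not values taken from the paper or the citing works.

```python
# Minimal sketch: i-vector appended to each acoustic frame before a DNN
# acoustic model. All dimensions below are assumptions for illustration.
import torch
import torch.nn as nn

class IVectorDNN(nn.Module):
    def __init__(self, feat_dim=40, ivec_dim=100, hidden_dim=1024, num_states=2000):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + ivec_dim, hidden_dim), nn.Sigmoid(),
            nn.Linear(hidden_dim, hidden_dim), nn.Sigmoid(),
            nn.Linear(hidden_dim, num_states),  # pre-softmax senone scores
        )

    def forward(self, frames, ivector):
        # frames:  (num_frames, feat_dim)  acoustic features of one utterance
        # ivector: (ivec_dim,)             speaker/utterance-level i-vector
        ivec = ivector.unsqueeze(0).expand(frames.size(0), -1)
        return self.net(torch.cat([frames, ivec], dim=-1))

# Usage: the same i-vector is tiled across all frames of the utterance.
model = IVectorDNN()
logits = model(torch.randn(300, 40), torch.randn(100))  # -> (300, 2000)
```

Because the i-vector is constant within an utterance, it acts as a bias-like speaker descriptor that the network can use to normalise speaker variability without any speaker-dependent weights.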
“…Directly adapting all the weights of a large DNN is computationally and data intensive, and results in large speaker-dependent parameter sets. Smaller subsets of the DNN weights may be modified, including biases and slopes of hidden units [7], [34], [43], [44]. Another recently developed approach relies on learning hidden unit contributions (LHUC) for test-only adaptation [10], [11] as well as in a SAT framework [45].…”
Section: DNN Acoustic Modelling and Adaptation (mentioning)
confidence: 99%
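
To make the LHUC idea mentioned above concrete, here is a minimal sketch assuming the commonly used 2·sigmoid(r) amplitude re-parameterisation, in which only one scalar per hidden unit is learned per speaker while the speaker-independent weights stay frozen. Layer sizes and the learning rate are illustrative assumptions, not values from the cited work.

```python
# Minimal LHUC sketch: per-speaker amplitudes rescale hidden unit outputs.
import torch
import torch.nn as nn

class LHUCLayer(nn.Module):
    def __init__(self, hidden_dim):
        super().__init__()
        # One speaker-dependent parameter per hidden unit; r = 0 gives an
        # amplitude of 2*sigmoid(0) = 1, leaving the SI network unchanged.
        self.r = nn.Parameter(torch.zeros(hidden_dim))

    def forward(self, h):
        return 2.0 * torch.sigmoid(self.r) * h

# Test-only adaptation: freeze the speaker-independent layer, update only
# the LHUC parameters on the adaptation data for each test speaker.
hidden = nn.Sequential(nn.Linear(40, 1024), nn.Sigmoid())
lhuc = LHUCLayer(1024)
for p in hidden.parameters():
    p.requires_grad = False
optimizer = torch.optim.SGD(lhuc.parameters(), lr=0.8)  # lr is an assumption
```

The speaker-dependent parameter set is thus only as large as the number of hidden units, which is why LHUC is attractive when adaptation data is scarce or unsupervised.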
“…Domain Adaptation (DA) is a particular case of transfer learning that leverages labeled data in the source domain to learn a classifier for unlabeled data in the target domain [15]. In recent years, domain adaptation methods have been successfully developed and applied in many practical tasks such as sentiment analysis [28], object recognition in different situations [29,30], facial recognition [31], speech recognition [32], video recognition [33], etc. Generally, it is assumed that the task is the same for different domains, i.e.…”
Section: Introduction (mentioning)
confidence: 99%
“…Even though DNNs are often trained on large databases to learn variabilities in the input feature space, they remain vulnerable to mismatched conditions between training and testing data, which lead to significant performance degradation [3]. Speaker adaptation is commonly used in DNNs to reduce the mismatch arising from speaker variability between training and testing data [4]. Adaptation to acoustic conditions has also been shown to improve DNN-based acoustic modelling performance [3].…”
Section: Introduction (mentioning)
confidence: 99%
“…The utterance-dependent (UD) transformation matrix is also produced in a different way. Furthermore, compared to most other DNN adaptation methods [4,6], the main differences of our approach lie in the structure used to represent the additional information and in the training procedure. In other methods [9], the speaker information is given as an additional feature to the DNN.…”
mentioning
confidence: 99%