2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2012
DOI: 10.1109/icassp.2012.6289082

Resource configurable spoken query detection using Deep Boltzmann Machines

Abstract: In this paper we present a spoken query detection method based on posteriorgrams generated from Deep Boltzmann Machines (DBMs). The proposed method can be deployed in both semi-supervised and unsupervised training scenarios. The DBM-based posteriorgrams were evaluated on a series of keyword spotting tasks using the TIMIT speech corpus. In unsupervised training conditions, the DBM-approach improved upon our previous best unsupervised keyword detection performance using Gaussian mixture model-based posteriorgram…

Cited by 49 publications (53 citation statements)
References 8 publications
“…For the case where we use none of the annotations, we train a 61-mixture GMM first and then assign the labels according to the index of the mixture with the highest posterior probability [7]. [Fig. 2: System performance with respect to different percentage of annotations used for back propagation]…”
Section: Results (mentioning)
confidence: 99%
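
The labeling step described in the statement above (take the index of the mixture component with the highest posterior as the frame label) can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: it assumes a diagonal-covariance GMM whose parameters are already trained, uses 2 components instead of 61 to stay readable, and all parameter values and function names (`posteriorgram`, `assign_labels`) are hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def posteriorgram(frames, weights, means, variances):
    """Return, for each frame, the posterior probability of every component."""
    result = []
    for x in frames:
        # log p(component) + log p(x | component) for each component
        log_joint = [
            math.log(w) + log_gaussian_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        # Subtract the maximum before exponentiating for numerical stability.
        peak = max(log_joint)
        unnorm = [math.exp(lj - peak) for lj in log_joint]
        total = sum(unnorm)
        result.append([u / total for u in unnorm])
    return result


def assign_labels(posteriors):
    """Label each frame with the index of its highest-posterior component."""
    return [max(range(len(p)), key=p.__getitem__) for p in posteriors]


# Hypothetical toy parameters: 2 components over 2-dimensional features.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [4.0, 4.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]

# Three hypothetical feature frames.
frames = [[0.1, -0.2], [3.9, 4.2], [0.3, 0.1]]

posteriors = posteriorgram(frames, weights, means, variances)
labels = assign_labels(posteriors)
```

Each row of `posteriors` is one column of a Gaussian posteriorgram, and `labels` holds the derived pseudo-labels. In an unsupervised setting, the GMM parameters would first be estimated from unlabeled audio features (for example with EM), a step omitted here.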
“…As Zhang et al. [7] have reported that using DBN posteriorgrams greatly improves the performance on the task of keyword spotting, which is essentially a comparison task, we would also like to explore the use of DBN posteriorgrams in our system, with various settings for training.…”
Section: Related Work (mentioning)
confidence: 99%
“…Parallelizing the model estimation procedure on a multicore system or a graphics processor unit using techniques similar to [22] would allow for larger audio corpora to be analyzed. The use of new unsupervised and semi-supervised acoustic models [17,24] may prove to be useful for improving the performance of the spoken term discovery procedure. Finally, reworking the models presented in this paper into a fully Bayesian framework would expand the flexibility of the models and allow their size to automatically scale with the amount of data used.…”
Section: Discussion (mentioning)
confidence: 99%
“…Due to distribution learning capability of Gaussian-Bernoulli Restricted Boltzmann Machines (GBRBM), RBM-based posteriorgrams were found to be comparable to Gaussian posteriorgrams. Restricted Boltzmann Machines (RBM) and Deep Belief Network (DBN) were used for QbE-STD tasks as an alternative to Gaussian posteriorgrams [9], [10].…”
Section: Introduction (mentioning)
confidence: 99%