2005
DOI: 10.1007/11551874_39

Phoneme Based Acoustics Keyword Spotting in Informal Continuous Speech

Cited by 64 publications
(37 citation statements)
references
References 5 publications
“…Following [19], an HMM was built for each query using a dictionary, and the log likelihood ratio between a query model and a background model (free phone loop) was computed. The development forced alignment and graphemic transcription of queries to obtain reference pronunciation of each query were used.…”
Section: Open Systems | mentioning
confidence: 99%
“…Parallelly concatenated keyword models are then accompanied by filler and background models (represented by simple phone loops) to create a decoding network, as shown in Fig 1. Likelihoods of the detected keywords are taken from the last state of each keyword model (computed using Viterbi decoder) and compared with the likelihood obtained from the background model. Confidence Score (CS) of each detected keyword is then given as a log-likelihood ratio between these two likelihoods [1]. Such the acoustic KWS is denoted as KWS 1xRT acoust and is able to run much faster than LVCSR-KWS (far below 1xRT).…”
Section: Acoustic KWS | mentioning
confidence: 99%
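
The confidence score described in the statement above can be illustrated with a minimal sketch. It assumes single-state phone models, cost-free transitions, and hand-picked per-frame log-likelihoods; the function names (`viterbi_left_to_right`, `phone_loop_loglike`, `confidence_score`) and the toy numbers are hypothetical and are not taken from the cited papers.

```python
NEG_INF = float("-inf")

def viterbi_left_to_right(frame_loglikes):
    """Best-path log-likelihood of a left-to-right keyword model,
    read from its last state after the final frame.
    frame_loglikes[t][s] is the log emission likelihood of state s at frame t.
    Assumption: self-loops and forward transitions cost nothing."""
    n_states = len(frame_loglikes[0])
    score = [NEG_INF] * n_states
    score[0] = frame_loglikes[0][0]  # the path must start in the first state
    for t in range(1, len(frame_loglikes)):
        new_score = [NEG_INF] * n_states
        for s in range(n_states):
            best_prev = score[s]  # stay in the same state
            if s > 0:
                best_prev = max(best_prev, score[s - 1])  # advance one state
            if best_prev > NEG_INF:
                new_score[s] = best_prev + frame_loglikes[t][s]
        score = new_score
    return score[-1]

def phone_loop_loglike(frame_loglikes):
    """Background model as a free phone loop: with single-state phones and
    cost-free transitions, the best path takes the best phone at every frame."""
    return sum(max(frame) for frame in frame_loglikes)

def confidence_score(keyword_ll, background_ll):
    """Log-likelihood ratio between the keyword and background models."""
    return keyword_ll - background_ll

# Toy data: 4 frames, 3 phones (columns); the keyword is phone 0 then phone 1.
all_phones = [
    [-1.0, -3.0, -2.0],
    [-1.0, -4.0, -5.0],
    [-3.0, -1.0, -2.0],
    [-4.0, -1.0, -0.5],
]
keyword_states = [[frame[0], frame[1]] for frame in all_phones]

keyword_ll = viterbi_left_to_right(keyword_states)
background_ll = phone_loop_loglike(all_phones)
cs = confidence_score(keyword_ll, background_ll)
```

Because the keyword path is one of the paths the phone loop may take, the score in this sketch is never positive; a detection would be accepted when the score exceeds a tuned threshold close to zero.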
“…Unlike ASR, the acoustic KWS does not need to recognize the whole sentence. The keywords are searched in parameterized spoken data (acoustic features) [1]. Unlike acoustic KWS, Large Vocabulary Continuous Speech Recognition (LVCSR) based KWS systems (often called Spoken Term Detection systems) search keywords in the output of the LVCSR, i.e., word recognition strings - lattices [2].…”
Section: Introduction | mentioning
confidence: 99%
“…The first was a keyword-filler hidden Markov model-based system with an architecture similar to [15], built on top of a Gaussian mixture model (GMM) based monophone acoustic model (one 8 component GMM per phone with a standard 39-dimensional mel frequency cepstral coefficient observation space, trained on the TIMIT sx/i sentences). Each keyword HMM consisted of one pronunciation path per word with the phone GMMs defining the emission densities.…”
Section: Baselines | mentioning
confidence: 99%