A basic assumption of most current speech recognition systems is that the speech to be recognized consists solely of words from a predefined vocabulary. For speech recognition applications in the telephone network, it is naive to assume that users will adhere strictly to this protocol. In Wilpon et al. [1,2], a hidden Markov model based keyword spotting algorithm was presented, which can recognize key words from a predefined vocabulary list spoken in an unconstrained fashion. In our current work, we explore several improvements to the feature analysis used to represent the speech signal and to the modeling techniques used to train our system. Common measures for evaluating recognition systems are essential for comparing results and assessing claims. However, no such measure has yet been developed for keyword spotting systems. In our paper, we will discuss several task domain issues that influence evaluation criteria. Scoring methodologies that fit our research goals will be described. We will present results from extensive evaluations on three speaker-independent databases: the 20-word vocabulary Stonehenge Road Rally database, distributed by the National Security Agency (NSA); a five-word vocabulary used to automate operator-assisted calls; and a three-word Spanish vocabulary that is currently being trialed in Spain's telephone network. Currently, we are achieving recognition accuracies ranging from 99.9% on the Spanish database to 74% (with 8.8 FA/H/W) on the Stonehenge task.
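The abstract reports keyword spotting performance as a detection accuracy paired with a false alarm rate in FA/H/W, read here as false alarms per hour per keyword. A minimal sketch of how such figures could be computed from raw counts follows. The function names and the example counts are hypothetical and are not taken from the paper, whose own scoring methodology may differ.

```python
def false_alarms_per_hour_per_word(num_false_alarms, hours_of_speech, vocabulary_size):
    """Normalize a raw false alarm count by test duration and keyword count.

    Assumption: FA/H/W = false alarms / (hours of speech * number of keywords).
    """
    if hours_of_speech <= 0 or vocabulary_size <= 0:
        raise ValueError("hours_of_speech and vocabulary_size must be positive")
    return num_false_alarms / (hours_of_speech * vocabulary_size)


def detection_rate(num_hits, num_true_occurrences):
    """Fraction of true keyword occurrences that the spotter detected."""
    if num_true_occurrences <= 0:
        raise ValueError("num_true_occurrences must be positive")
    return num_hits / num_true_occurrences


# Hypothetical counts chosen only to illustrate the arithmetic:
# 352 false alarms over 2 hours of speech with a 20-keyword vocabulary.
fa_rate = false_alarms_per_hour_per_word(352, 2.0, 20)
hit_rate = detection_rate(74, 100)
print(f"{hit_rate:.0%} detected at {fa_rate:.1f} FA/H/W")
```

Normalizing by both duration and vocabulary size makes false alarm rates comparable across tasks with different test set lengths and keyword counts, which is one of the comparability concerns the abstract raises.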
Serum concentrations of GIP and PYY on postnatal day 7 are independently associated with time to full enteral feedings. The link between serum gut hormone concentrations and time to full enteral feedings is not fully mediated by nutritional factors, suggesting an independent mechanism underlying the influence of gut hormones on feeding tolerance and time to full enteral feedings.
We describe a system that automatically acquires a language model for a particular task from semantic-level information. This is in contrast to systems with predefined vocabulary and syntax. The purpose of the system is to map spoken or typed input into a machine action. To accomplish this task we use a medium-grain neural network. We introduce a novel adaptive training procedure for estimating the connection weights, which has the advantages of rapid, single-pass and order-invariant learning. The resulting weights have information-theoretic significance, and do not require gradient search techniques for their estimation. We experimentally evaluate the system on three text-based tasks. The first is a three-class inward-call manager with an acquired vocabulary of over 1600 words. The second is a fifteen-action subset of the DARPA Resource Manager, with an acquired vocabulary of over 700 words. The third is to discriminate between idiomatic phrases meaning 'yes' or 'no'.
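The abstract describes connection weights that are learned in a single pass, do not depend on the order of the training examples, carry information-theoretic meaning, and need no gradient search. One way to obtain weights with those properties is to estimate them from co-occurrence counts. The sketch below uses a smoothed log ratio log(P(action | word) / P(action)) as the weight. This choice, the class name, the smoothing, and the toy data are all assumptions made for illustration and are not confirmed as the paper's procedure.

```python
import math
from collections import Counter


class CountBasedAssociator:
    """Word-to-action weights from counts (hypothetical illustration)."""

    def __init__(self):
        self.word_action = Counter()  # (word, action) co-occurrence counts
        self.word = Counter()         # number of inputs containing each word
        self.action = Counter()       # number of inputs labeled with each action
        self.total = 0

    def observe(self, words, action):
        """Update counts from one example; counts do not depend on example order."""
        self.total += 1
        self.action[action] += 1
        for w in set(words):
            self.word[w] += 1
            self.word_action[(w, action)] += 1

    def weight(self, w, action):
        """log(P(action | word) / P(action)) with add-one smoothing; 0 for unseen words."""
        if self.word[w] == 0:
            return 0.0
        p_action_given_word = (self.word_action[(w, action)] + 1) / (
            self.word[w] + len(self.action)
        )
        p_action = self.action[action] / self.total
        return math.log(p_action_given_word / p_action)

    def classify(self, words):
        """Pick the action with the largest log prior plus summed word weights."""
        def score(action):
            prior = math.log(self.action[action] / self.total)
            return prior + sum(self.weight(w, action) for w in set(words))
        return max(self.action, key=score)


# Toy yes/no data, invented for this sketch.
examples = [
    ("yes please".split(), "yes"),
    ("sure thing".split(), "yes"),
    ("no thanks".split(), "no"),
    ("no way".split(), "no"),
]
forward, backward = CountBasedAssociator(), CountBasedAssociator()
for words, action in examples:
    forward.observe(words, action)
for words, action in reversed(examples):
    backward.observe(words, action)
```

Because the weights are computed from accumulated counts, training on the examples in either order yields identical weights, and each example is visited exactly once.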
This paper reports the results of our experiments on speaker-independent phonetic transcription of fluent speech. Our acoustic/phonetic model is a 38505-parameter continuously variable duration hidden Markov model which allows us to perform real-time phonetic transcription by means of a modified Viterbi algorithm. The model was trained on 3020 sentences from the TIMIT database. Testing was performed on the remaining 180 sentences. In a test without lexical or syntactic constraints, we obtained 52% correct phonetic transcription with 12% insertions. We also describe the design of a system for recognition of fluent speech based on our technique for phonetic transcription.
$L_2 = L_1$. Determine $L_3$, the set of current, well-formed partial sentences $w_1, w_2, \ldots, w_{j-1}, w_j$. $L_3$ includes $w_1, w_2, \ldots, w_{j-1}, w_j$ if $L_2$ includes $w_1, w_2, \ldots, w_{j-1}$, the word lattice includes $w_j$, and $\mathrm{CONNECT}(w_{j-1}, w_j)$ is true. $\mathrm{CONNECT}(w_{j-1}, w_j)$ is true if the word pair $(w_{j-1}, w_j)$ exists and $\mathrm{START\text{-}TIME}(w_j) - \mathrm{END\text{-}TIME}(w_{j-1})$ is within a specific time span. Add well-formed sentences to $W$. $W$ includes $w_1, w_2, \ldots, w_n$ if $L_3$ includes $w_1, w_2, \ldots, w_n$, the word pair $(w_n, \text{SENTENCE-END})$ exists, and $\mathrm{END\text{-}TIME}(w_n)$ is within some specific time span. If $L_3 \neq \emptyset$, then $L_2 = L_3$, $L_3 = \emptyset$, go to 3.
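The sentence expansion steps in the fragment above (extend each partial sentence by a lattice word that passes CONNECT, collect complete sentences in W, and repeat until no partial sentence can be extended) can be sketched as follows. The initialization of L1 is not given in the fragment, so the sketch takes it as an input. The time span bounds `max_gap` and `end_slack`, the `"</s>"` marker, and the toy lattice are hypothetical values chosen for illustration.

```python
from typing import NamedTuple

SENTENCE_END = "</s>"  # hypothetical marker standing in for SENTENCE-END


class LatticeWord(NamedTuple):
    word: str
    start: float  # start time in seconds
    end: float    # end time in seconds (assumed greater than start)


def connect(prev, nxt, word_pairs, max_gap):
    """CONNECT: the word pair exists and the time gap lies in [0, max_gap]."""
    gap = nxt.start - prev.end
    return (prev.word, nxt.word) in word_pairs and 0.0 <= gap <= max_gap


def expand_sentences(initial, lattice, word_pairs, utterance_end,
                     max_gap=0.05, end_slack=0.05):
    complete = []            # W: well-formed complete sentences
    current = list(initial)  # L2 = L1
    while current:
        # L3: every extension of a partial sentence by a connectable lattice word
        extended = [
            partial + (w,)
            for partial in current
            for w in lattice
            if connect(partial[-1], w, word_pairs, max_gap)
        ]
        # Add to W the extensions whose last word may end a sentence on time
        for sentence in extended:
            last = sentence[-1]
            if ((last.word, SENTENCE_END) in word_pairs
                    and abs(utterance_end - last.end) <= end_slack):
                complete.append(sentence)
        current = extended   # L2 = L3; L3 is rebuilt on the next pass
    return complete


# Toy lattice and word pair grammar, invented for this sketch.
lattice = [
    LatticeWord("call", 0.0, 0.4),
    LatticeWord("home", 0.42, 0.9),
    LatticeWord("hold", 0.41, 0.7),
    LatticeWord("now", 0.9, 1.2),
]
word_pairs = {
    ("call", "home"), ("call", "hold"), ("home", "now"),
    ("home", SENTENCE_END), ("now", SENTENCE_END),
}
sentences = expand_sentences([(lattice[0],)], lattice, word_pairs, utterance_end=1.2)
print([[w.word for w in s] for s in sentences])
```

The loop terminates because each extension must start at or after the end of the previous word, so partial sentences can only move forward in time through a finite lattice.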