Three perceptual experiments were conducted to test the relative importance of vowels versus consonants to the recognition of fluent speech. Sentences were selected from the TIMIT corpus to obtain approximately equal numbers of vowels and consonants within each sentence and equal durations across the set of sentences. In experiments 1 and 2, subjects listened to (a) unaltered TIMIT sentences, (b) sentences in which all of the vowels were replaced by noise, or (c) sentences in which all of the consonants were replaced by noise. The subjects listened to each sentence five times and attempted to transcribe what they heard. The results of these experiments show that recognition of words depends more upon vowels than consonants: about twice as many words are recognized when vowels are retained in the speech. The effect was observed whether occurrences of [l], [r], [w], [y], [m], and [n] were included in the sentences (experiment 1) or replaced by noise (experiment 2). Experiment 3 tested the hypothesis that vowel boundaries contain more information about the neighboring consonants than vice versa.
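The stimulus construction described above — replacing every segment of one phoneme class with noise — can be sketched as follows, assuming TIMIT-style phone alignments given as (start, end, label) sample ranges. The vowel label set and the matched-amplitude noise model are illustrative assumptions, not the paper's exact procedure.

```python
import random

# Illustrative subset of TIMIT vowel labels (an assumption, not the
# paper's exact inventory).
VOWELS = {"iy", "ih", "eh", "ae", "aa", "ah", "ao", "uh", "uw",
          "er", "ey", "ay", "oy", "aw", "ow"}

def replace_with_noise(samples, segments, targets, amplitude=0.1):
    """Replace labeled segments with uniform noise.

    samples:  waveform as a list of floats
    segments: list of (start, end, label) sample-index ranges
    targets:  set of phone labels to overwrite with noise
    """
    out = list(samples)
    for start, end, label in segments:
        if label in targets:
            for i in range(start, end):
                out[i] = random.uniform(-amplitude, amplitude)
    return out
```

Condition (b) of the experiments would then pass `VOWELS` as `targets`, and condition (c) the complementary consonant set.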
This paper presents a simple algorithm that converts any context-free grammar (without ε-productions) into a connectionist network which parses strings (of arbitrary but fixed maximum length) in the language defined by that grammar. The network is fast and deterministic. Some modifications of the network are also explored, including parsing near misses, disambiguating, and learning new productions dynamically.
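The kind of fixed-maximum-length, deterministic parsing the abstract describes can be illustrated with a chart parser: in the paper's construction each (chart cell, nonterminal) pair would correspond to a network unit that activates in parallel. The CYK-style sketch below is an assumption for illustration (grammar in Chomsky normal form, no ε-productions), not the paper's actual network encoding.

```python
def cyk_parse(grammar, start, tokens):
    """Return True iff `tokens` is derivable from `start`.

    grammar: dict mapping nonterminal -> list of RHS tuples, where an
    RHS is either (terminal,) or (NT1, NT2) (Chomsky normal form).
    """
    n = len(tokens)
    # chart[i][l] = set of nonterminals deriving tokens[i : i + l + 1];
    # in the network formulation each entry here would be one unit.
    chart = [[set() for _ in range(n)] for _ in range(n)]
    for i, tok in enumerate(tokens):
        for nt, rhss in grammar.items():
            if (tok,) in rhss:
                chart[i][0].add(nt)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = chart[i][split - 1]
                right = chart[i + split][length - split - 1]
                for nt, rhss in grammar.items():
                    for rhs in rhss:
                        if len(rhs) == 2 and rhs[0] in left and rhs[1] in right:
                            chart[i][length - 1].add(nt)
    return start in chart[0][n - 1]
```

Because every cell's contents are determined once its sub-spans are filled, all cells of a given span length can be computed simultaneously — the property that makes a fast, deterministic network realization plausible.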
Massively parallel (connectionist) computation is receiving widespread attention. Both the theory of connectionist computation and its applications to a broad range of problems are advancing rapidly. Somewhat less attention has been paid to the experimental methodologies of designing and testing massively parallel networks. This paper describes a coordinated set of specification, simulation, and monitoring tools that have proven to be quite useful in our research. Also included are an overview of the structured connectionist paradigm and several sample applications.
We describe EAR, an English Alphabet Recognizer that performs speaker-independent recognition of letters spoken in isolation. During recognition, (a) signal processing routines transform the digitized speech into useful representations, (b) rules are applied to the representations to locate segment boundaries, (c) feature measurements are computed on the speech segments, and (d) a neural network uses the feature measurements to classify the letter. The system was trained on one token of each letter from 120 speakers. Performance was 95% when tested on a new set of 30 speakers. Performance was 96% when tested on a second token of each letter from the original 120 speakers (multi-speaker recognition). EAR is the first fully automatic, neural-network-based, speaker-independent spoken letter recognition system. The recognition accuracy is 6% higher than previously reported systems (half the error rate). We attribute the high level of performance to accurate and explicit phonetic segmentation, the use of speech knowledge to select features that measure the important linguistic information, and the ability of the neural classifier to model the variability of the data.
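The four-stage pipeline in (a)–(d) can be sketched as a toy, runnable skeleton. Every component below is a stand-in chosen for illustration — per-frame energy for the signal processing, a threshold rule for segmentation, duration/energy features, and a nearest-prototype classifier in place of EAR's neural network.

```python
def frame_energies(samples, frame=160):
    """(a) Signal processing stand-in: per-frame energy of the waveform."""
    return [sum(x * x for x in samples[i:i + frame])
            for i in range(0, len(samples), frame)]

def segment(energies, threshold=0.01):
    """(b) Rule-based segmentation stand-in: frames above threshold are speech."""
    return [i for i, e in enumerate(energies) if e > threshold]

def features(energies, speech_frames):
    """(c) Feature measurements on the segmented region: duration, mean energy."""
    if not speech_frames:
        return (0.0, 0.0)
    dur = len(speech_frames)
    mean_e = sum(energies[i] for i in speech_frames) / dur
    return (float(dur), mean_e)

def classify(feat, prototypes):
    """(d) Classifier stand-in: nearest prototype instead of a neural net."""
    return min(prototypes, key=lambda name: sum(
        (a - b) ** 2 for a, b in zip(feat, prototypes[name])))
```

The staging, not the particular components, is the point: explicit segmentation before feature measurement is what the authors credit for the system's accuracy.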