This paper describes experiments in which word recognition is based on comparing the projections of input words onto an orthogonal basis with those of a stored library of words. An initial orthogonal basis is determined from the generalized spectrum of short time segments selected from a vocabulary of ten words. The initial basis is then optimized by minimizing the complementary error energy. Projecting a spoken word onto the optimum orthogonal basis generates a sequence of numbers that represents the word. The spoken word is identified by correlating the absolute values of this sequence with those of a stored library of words. The rate of correct recognition ranges from 71.6 to 96.6 percent for two speakers.

Techniques are developed to improve the recognition scores and to reduce the lengthy computer processing time and large storage requirement. First, a master template is made for each word by averaging six templates of that word. For one speaker, the rate of correct recognition increases to 100 percent when incoming words are compared against the master templates. For a second speaker, the recognition rates improve significantly, varying between 93 and 98 percent when the master templates are used. To further improve the recognition process, the feasibility of grouping words into several classes is demonstrated. The classifications are based on the locations of formant regions and the time durations of each spoken word,
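The matching steps described above (projection onto a basis, absolute values, correlation against stored templates, and averaged master templates) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the use of Pearson correlation as the similarity measure, the toy two-dimensional basis, and the assumption that every word has already been cut into the same number of equal-length frames are all hypothetical choices made for illustration.

```python
import math

def features(frames, basis):
    """Absolute values of the projections of each frame onto each basis vector.

    Assumes `basis` is a list of orthonormal vectors with the same length
    as every frame (an illustrative assumption)."""
    return [abs(sum(f * b for f, b in zip(frame, vec)))
            for frame in frames for vec in basis]

def correlation(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if undefined)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    denom = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return sum(a * b for a, b in zip(dx, dy)) / denom if denom else 0.0

def master_template(templates):
    """Element-wise average of several feature sequences for the same word."""
    n = len(templates)
    return [sum(values) / n for values in zip(*templates)]

def recognize(feature, library):
    """Return the library word whose template correlates best with `feature`."""
    return max(library, key=lambda word: correlation(feature, library[word]))

# Toy data: a 2-D orthonormal basis and two made-up "words" of two frames each.
basis = [[1.0, 0.0], [0.0, 1.0]]
library = {
    "one": features([[1.0, 0.0], [0.0, 1.0]], basis),
    "two": features([[0.0, 1.0], [1.0, 0.0]], basis),
}
spoken = features([[-0.9, 0.1], [0.2, 0.8]], basis)
best_match = recognize(spoken, library)
```

In this toy case `best_match` is `"one"`, because taking absolute values makes the sign of the first frame's projection irrelevant. With real speech, each library entry would be a master template built by passing several utterances of the same word through `master_template`.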