On the effects of varying analysis parameters of an LPC based, isolated word recognizer

Rabiner, L. R.; Wilpon, J. G.; Ackenhusen, John G.

doi:10.1121/1.2004972

Cited by 4 publications

(2 citation statements)

References 0 publications

“…This vocabulary was selected for its high degree of complexity and moderate size [28]. The measured recognition accuracy for this vocabulary has been shown to be relatively low in previous tests [10], [29]. Thus, small differences in system performance can often be reliably measured with a reasonable size set for this vocabulary.…”

Section: Description Of Experiments and Results (mentioning)

confidence: 99%

On the effects of varying filter bank parameters on isolated word recognition

Dautrich, Rabiner, Martin

1983

IEEE Trans. Acoust., Speech, Signal Process.

The vast majority of commercially available isolated word recognizers use a filter bank analysis as the front end processing for recognition. It is not well understood how the parameters of different filter banks (e.g., number of filters, types of filters, filter spacing, etc.) affect recognizer performance. In this paper we present results of performance evaluation of several types of filter bank analyzers in a speaker trained isolated word recognition test using dialed-up telephone line recordings. We have studied both DFT (discrete Fourier transform) and direct form implementations of the filter banks. We have also considered uniform and nonuniform filter spacings. The results indicate that the best performance (highest word accuracy) is obtained by both a 15-channel uniform filter bank and a 13-channel nonuniform filter bank (with channels spaced along a critical band scale). The performance of a 7-channel critical band filter bank is almost as good as that of the two best filter banks. In comparison to a conventional linear predictive coding (LPC) word recognizer, the performance of the best filter bank recognizers was, on average, several percent worse than that of an eighth-order LPC-based recognizer. A discussion as to why some filter banks performed better than others, and why the LPC-based system did the best, is given in this paper.

I. INTRODUCTION

Since the early 1970's, researchers have been working on building machines that have the ability to communicate with man in his natural method of communication. One research area that has developed from this work is that of speech recognition. The general goal of speech recognition is to understand normal human speech and then to be able to perform some task based on this understanding. This is a very natural goal in that it requires machines to adapt to humans rather than vice versa.
In this way speech recognition would provide a convenient method of communication with machines (e.g., computers) via terminals and ordinary telephone handsets. Progress has been made toward the general goal of speech recognition by imposing some restrictions on the speech input. These restrictions are usually in the form of limits placed on the vocabulary, the set of allowable users, or the mode of the input. The purpose of this last limitation (probably the most severe one) is to restrict the form of input speech to a set of isolated word commands, instead of continuous speech, in order to achieve reliable recognition. With these restrictions speech recognition has made major strides forward in the past decade and several commercial systems have appeared [1]-[6]. These systems are predominantly isolated word speaker-trained systems. The availability of these systems has led to an increased interest in the possibility of producing terminal equipment that uses this new technology.

“…While Liporace's results are significant in expanding the scope of the reestimation algorithm, the requirements that the observation densities be elliptically symmetric are in many real situations still very restrictive. In particular, useful parametrizations of speech signals, such as reflection coefficients and autocorrelation, have been shown by Gray and Markel 9 and Rabiner et al, 10 respectively, not to exhibit the desired symmetry. This lack of symmetry is often observed even within each state because of the arbitrariness in choosing the number of states for modeling the given process.…”

Section: All θ T=l (mentioning)

confidence: 99%

Maximum-Likelihood Estimation for Mixture Multivariate Stochastic Observations of Markov Chains

Juang

1985

AT&T Technical Journal

In this paper we discuss parameter estimation by means of the reestimation algorithm for a class of multivariate mixture density functions of Markov chains. The scope of the original reestimation algorithm is expanded and the previous assumptions of log concavity or ellipsoidal symmetry are obviated, thereby enhancing the modeling capability of the technique. Reestimation formulas in terms of the well-known forward-backward inductive procedure are also derived.

Microprocessor implementation of an LPC-based isolated word recognizer

Ackenhusen, Rabiner

ICASSP '81. IEEE International Conference on Acoustics, Speech, and Signal Processing

A digital-based isolated word recognition system has been implemented in a module of dedicated hardware that uses a microprocessor and programmable digital signal processing circuitry. The recognizer is based upon the minimum prediction residual principle of Itakura. The recognition algorithm has been developed and tested on a general-purpose minicomputer and array processor, where it has been shown to be suitable for several recognition tasks. The recognition hardware consists of an Intel 8086 16-bit microprocessor operating in parallel with a digital speech processing peripheral (DSPP) tailored to the algorithm. The microprocessor performs the supervisory and decision operations; the DSPP performs the 200,000T + 4,500N multiply-add operations (and associated data transfers) associated with the recognition of a word of duration T sec from an N word vocabulary with 1 template per word. The recognizer is compact (board area of 250 sq. in.) and inexpensive (commercial component cost of about $1200 for 40 word templates).
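
To give a feel for the operation count quoted in this abstract, the short sketch below evaluates 200,000T + 4,500N for one set of inputs. The function name `multiply_adds` and the 0.5 s word duration are illustrative assumptions, not taken from the paper; the 40-word vocabulary matches the template count mentioned alongside the cost figure.

```python
def multiply_adds(duration_sec: float, vocab_size: int) -> float:
    """Multiply-add count from the abstract: 200,000*T + 4,500*N.

    duration_sec: word duration T in seconds.
    vocab_size:   vocabulary size N, assuming one template per word.
    """
    return 200_000 * duration_sec + 4_500 * vocab_size


# Assumed example: a 0.5 s word (hypothetical) against a 40-word vocabulary.
ops = multiply_adds(0.5, 40)
print(f"{ops:,.0f} multiply-adds")  # 280,000 multiply-adds
```

Under these assumed inputs the vocabulary-dependent term (180,000) is larger than the duration-dependent term (100,000), so the total grows mainly with vocabulary size at this scale.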
