1982
DOI: 10.1121/1.2019821

On the effects of varying filter bank parameters on isolated word recognition

Abstract: The vast majority of commercially available isolated word recognizers use a filter bank analysis as the front end processing for recognition. It is not well understood how the parameters of different filter banks (e.g., number of filters, types of filters, filter spacing, etc.) affect recognizer performance. In this talk we present results of performance evaluation of several types of filter bank analyzer in a speaker trained, isolated word recognition test using dialed-up telephone line recordings. We have st…

Cited by 7 publications
(7 citation statements)
References 0 publications
“…While the energy based spectral analysis schemes such as LPC work well under similar acoustic training and testing conditions, the system performance deteriorates significantly under adverse signal conditions [4]. For example, for an alphanumeric recognition task, the performance of the SPHINX system falls from 77-85% accuracy with matched training and testing recording environments to 19-37% accuracy on cross conditions [1].…”
Section: Motivation (mentioning)
confidence: 99%
“…These include such standard methods as measurement of the discrete (fast) Fourier transform (FFT), all-pole minimum-phase linear prediction (LPC) methods, and autoregressive/moving average models (Allen and Rabiner 1977; Atal and Hanauer 1971; Cadzow 1982; Makhoul 1975; Markel and Gray 1976; Schafer and Rabiner 1971). Even the more traditional filter-bank method of spectral analysis is still used in some systems (Dautrich, Rabiner, and Martin 1983), particularly in hardware implementations. To emphasize spectral properties that are known to be important to a human listener, auditory models can be incorporated in the overall spectral representation (Cohen 1985; Ghitza 1986).…”
Section: Measurements and Modeling Of Speech (mentioning)
confidence: 99%
“…With suitable generation of noisy training data, an upper bound in system performance can be obtained with respect to many feature/model based compensation techniques (Dautrich et al, 1983;Gales, 1995;Gong, 1995). In particular, training data contamination may be conveniently adopted where the speech distortion in the real environment is unknown or would be difficult to model explicitly (Acero, Deng, Kristjansson & Zhang, 2000).…”
Section: Introduction (mentioning)
confidence: 99%
“…A different approach, known in the literature as training data contamination (Dautrich, Rabiner & Martin, 1983;Das, Bakis, Nadas, Nahamoo & Picheny, 1993;Gong, 1995), provides a way of training acoustic models which are more robust and representative of the given real noisy environment than those derived by training on the corresponding clean speech. In general, training of speech recognizers is accomplished by using large speech corpora.…”
Section: Introduction (mentioning)
confidence: 99%