A technique for the measurement of vocal-tract formant frequencies and bandwidths during voiced speech is described. A theoretical justification for the method is presented, based on a model of the vocal tract that is linear and stationary over time intervals of the order of one pitch period (approximately 0.01 sec). In brief, the technique consists of selecting a portion of one pitch period of a speech waveform during which the glottis is closed. This finite-duration signal is approximated in a weighted-least-squares sense by a function of the form $\hat{f}(t) = \sum_{i=1}^{N} e^{-\pi B_i t}\,(a_i \cos 2\pi F_i t + c_i \sin 2\pi F_i t) + a_0$, where $a_i$, $c_i$, $F_i$, and $B_i$ are selected to minimize the weighted-squared error between $\hat{f}(t)$ and the actual speech signal of interest, $f(t)$. $B_i$ and $F_i$ are estimates of the bandwidth and frequency, respectively, of the $i$th formant. A digital-computer program was used to perform the minimization. The program was used to determine the bandwidths and frequencies of the first four formants of the vowels /i/, /ɔ/, and /ɑ/ in the context b(vowel)t. Two male speakers produced most of the speech waveforms. The technique seems to be most reliable for the first two formants. It yields results that appear to have a smaller variance than those obtained by previously reported methods, although the variance is still substantial.
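As an illustration of the objective being minimized, the following minimal sketch evaluates the damped-sinusoid model and its weighted-squared error against a sampled signal. It is not the program described above: the function names, the 10 kHz sampling rate, the 4 ms analysis window, the uniform weights, and the two formant parameter sets are all assumptions chosen for the example, and the minimization step itself is not shown.

```python
import math

def damped_sinusoid_model(t, formants, a0=0.0):
    """Evaluate the model at time t (seconds).

    formants is a list of (F_i, B_i, a_i, c_i) tuples, with the frequency
    F_i and bandwidth B_i in Hz. Each term is
    exp(-pi*B_i*t) * (a_i*cos(2*pi*F_i*t) + c_i*sin(2*pi*F_i*t)).
    """
    total = a0
    for F, B, a, c in formants:
        decay = math.exp(-math.pi * B * t)
        phase = 2.0 * math.pi * F * t
        total += decay * (a * math.cos(phase) + c * math.sin(phase))
    return total

def weighted_squared_error(times, samples, weights, formants, a0=0.0):
    """Sum of w * (model - sample)^2 over the analysis window."""
    return sum(
        w * (damped_sinusoid_model(t, formants, a0) - f) ** 2
        for t, f, w in zip(times, samples, weights)
    )

# Hypothetical analysis window: 40 samples at 10 kHz (4 ms).
fs = 10_000.0
times = [n / fs for n in range(40)]

# Hypothetical two-formant signal standing in for the closed-glottis segment.
true_formants = [(500.0, 60.0, 1.0, 0.0), (1500.0, 90.0, 0.5, 0.2)]
samples = [damped_sinusoid_model(t, true_formants) for t in times]
weights = [1.0] * len(times)  # uniform weights, an assumption

# The error vanishes at the generating parameters and grows when the
# first formant frequency is shifted by 20 Hz.
err_true = weighted_squared_error(times, samples, weights, true_formants)
shifted = [(520.0, 60.0, 1.0, 0.0), (1500.0, 90.0, 0.5, 0.2)]
err_shifted = weighted_squared_error(times, samples, weights, shifted)
```

For fixed $F_i$ and $B_i$, the model is linear in $a_i$, $c_i$, and $a_0$, so a practical fit could solve for those amplitudes by linear least squares and search only over the frequencies and bandwidths. That is a design option, not something the abstract specifies.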