2000
DOI: 10.1016/s0165-1684(00)00099-2

Speech formant frequency estimation: evaluating a nonstationary analysis method

Abstract: The objective of this paper is to critically evaluate the performance of a nonstationary analysis method in tracking speech formant frequencies as they change with time due to the natural variations in the vocal-tract system during speech production. The method of instantaneous frequency estimation is applied to the tracking of speech formant frequencies to observe the time variations in the vocal-tract system characteristics within a pitch period. An implementation of an instantaneous frequency estimator base…

Cited by 13 publications (5 citation statements)
References 9 publications
“…1 Block diagram of cepstrum method 1.3 Linear prediction cepstrum estimation method (LPC) Linear prediction cepstrum estimation method (LPC) is an effective method for estimating the spectral envelope, which is derived from linear analysis of the vocal tract filter, according to the channel filter to find the resonance peak. All samples of speech signal can be represented by a weighted sum of several samples in front of it, namely, we use a plurality of samples to predict the current value [8]. The so-called p-order linear prediction is based on the p sample signals such as x(n-1), x(n-2), …, x(n-p) in front weighted to predict the current value, namely:…”
Section: Short-time Fourier Transform Methods
Citation type: mentioning
confidence: 99%
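
The p-order linear prediction described in the statement above (the current sample approximated by a weighted sum of the p preceding samples) can be illustrated with a minimal sketch. It is not taken from the cited paper, whose equation is elided in the excerpt. It assumes the autocorrelation method for choosing the weights, and the names `lpc_coefficients`, `predict`, and `synth_ar2`, as well as the test-signal parameters, are hypothetical.

```python
import numpy as np


def lpc_coefficients(x, p):
    """Estimate weights a[0..p-1] so that x[n] ~ sum_k a[k-1] * x[n-k], k = 1..p.

    Assumption: the autocorrelation (Yule-Walker) method is used here.
    """
    x = np.asarray(x, dtype=float)
    # Autocorrelation of the signal at lags 0..p.
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(p + 1)])
    # Symmetric Toeplitz system R a = [r(1), ..., r(p)].
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, r[1:])


def predict(x, a):
    """One-step prediction of each x[n] from the previous len(a) samples."""
    x = np.asarray(x, dtype=float)
    p = len(a)
    pred = np.zeros_like(x)
    for n in range(p, len(x)):
        # x[n-p:n][::-1] is x[n-1], x[n-2], ..., x[n-p].
        pred[n] = np.dot(a, x[n - p:n][::-1])
    return pred


def synth_ar2(n, seed=0):
    """Hypothetical test signal: x[n] = 1.3 x[n-1] - 0.8 x[n-2] + noise."""
    rng = np.random.default_rng(seed)
    e = rng.standard_normal(n)
    x = np.zeros(n)
    for i in range(2, n):
        x[i] = 1.3 * x[i - 1] - 0.8 * x[i - 2] + e[i]
    return x


if __name__ == "__main__":
    x = synth_ar2(4000)
    a = lpc_coefficients(x, p=2)
    residual = x - predict(x, a)
    print("estimated weights:", a)
    print("signal variance:", np.var(x[2:]), "residual variance:", np.var(residual[2:]))
```

On this synthetic signal the estimated weights should land close to the generating values, and the prediction residual should have a much smaller variance than the signal itself.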
“…The vocal tract is comprised of the mouth from the vocal organ to the lips and the nasal passage that's coupled to the oral tract by manner of the velum. The oral tract takes on many various lengths and cross sections by moving tongue, teeth, lips, and jaw and has a median length of seventeen cm during a typical man and shorter for females, and a spatially varying crosswise of up to 20 cm² [12].…”
Section: Vocal Tract
Citation type: mentioning
confidence: 99%
“…By using formant frequencies as the acoustic characteristics, a classification system can be created to identify and differentiate vowel sounds. Formant frequency refers to the acoustic resonance of the human vocal tract, which is the spectral peak of the spectrum [6], [7]. For example, the formant frequency of vowel sound /i/ is the concentration of acoustic energy around a certain frequency in its speech sound waves as shown in Figure 1.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
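
The idea in the statement above, that a formant shows up as a resonance peak in the spectrum, can be sketched numerically. The example below is an illustration only and is not the method of the evaluated paper or of the citing paper: it assumes a single synthetic two-pole resonance (500 Hz centre frequency, 80 Hz bandwidth, 8 kHz sampling rate, all hypothetical values) and recovers its frequency from the pole angle of a linear-prediction fit. The function names are likewise hypothetical.

```python
import numpy as np


def lpc_coefficients(x, p):
    """Autocorrelation-method weights a so that x[n] ~ sum_k a[k-1] * x[n-k]."""
    x = np.asarray(x, dtype=float)
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(p + 1)])
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, r[1:])


def resonator_impulse_response(freq_hz, bandwidth_hz, fs, n):
    """Impulse response of a two-pole resonator (a stand-in for one formant)."""
    radius = np.exp(-np.pi * bandwidth_hz / fs)   # pole radius from bandwidth
    theta = 2.0 * np.pi * freq_hz / fs            # pole angle from centre frequency
    a1, a2 = 2.0 * radius * np.cos(theta), -radius * radius
    y = np.zeros(n)
    for i in range(n):
        y[i] = 1.0 if i == 0 else 0.0             # unit impulse excitation
        if i >= 1:
            y[i] += a1 * y[i - 1]
        if i >= 2:
            y[i] += a2 * y[i - 2]
    return y


def formants_from_lpc(a, fs):
    """Resonance frequencies (Hz) from the roots of z^p - a[0] z^(p-1) - ... - a[p-1]."""
    roots = np.roots(np.concatenate(([1.0], -np.asarray(a, dtype=float))))
    roots = roots[np.imag(roots) > 0]             # keep one root of each conjugate pair
    return np.sort(np.angle(roots) * fs / (2.0 * np.pi))


if __name__ == "__main__":
    fs = 8000
    y = resonator_impulse_response(500.0, 80.0, fs, 2000)
    print(formants_from_lpc(lpc_coefficients(y, 2), fs))
```

Real speech contains several formants plus a glottal excitation, so a practical estimator would use a higher prediction order and short analysis frames; the single-resonance case is chosen here only because its correct answer is known in advance.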