Techniques in Speech Acoustics

Harrington, Jonathan; Cassidy, Steve

doi:10.1162/coli.2000.26.2.294b

Cited by 40 publications

(12 citation statements)

References 0 publications


“…Both pitch-asynchronous linear prediction (LP) procedures benefited from signal pre-emphasis, which is intended to cancel out the spectral tilt of the glottal source and increase the accuracy of the formant estimates.[12] The formant estimates obtained using the covariance algorithm only on the closed phase with or without pre-emphasis were considerably less accurate, despite the occasional advocacy of this latter technique as the most accurate.[13] Closed phase linear prediction analysis is extremely sensitive to properly locating the analysis window, and it appears that it is not possible to overcome these problems using a manual system such as PRAAT. Reassigned spectrograms were computed for a brief excerpt from the middle of each vowel encompassing five or six glottal cycles.…”

Section: Methods (mentioning)

confidence: 99%

Accuracy of formant measurement for synthesized vowels using the reassigned spectrogram and comparison with linear prediction

Fulop

2010

The Journal of the Acoustical Society of America


This brief report describes a small study which was undertaken with nine synthetic vowel tokens, in an effort to demonstrate the validity of the reassigned spectrogram as a formant measurement tool. The reassigned spectrogram's performance is also compared with that of a typical pitch-asynchronous linear predictive analysis and is found to be superior. In this study, reassigned spectrograms were further processed to highlight the formants and then were used to measure these synthetic vowel formants generally to within 0.5% of their known true values, far surpassing the accuracy of a typical linear predictive analysis procedure which was inaccurate by as much as 17%. The overall accuracy of reassigned spectrographic formant measurement is thus demonstrated in these cases.
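The pre-emphasis step mentioned in the citation statement above can be illustrated with a first-order difference filter. This is only a minimal sketch: the function name `pre_emphasize` and the coefficient 0.97 are assumptions chosen for illustration, not values taken from the cited study.

```python
def pre_emphasize(samples, coeff=0.97):
    """Apply y[n] = x[n] - coeff * x[n-1], with y[0] = x[0].

    The filter attenuates slowly varying (low-frequency) content and
    boosts rapidly varying content, which counteracts the downward
    spectral tilt of the glottal source. `coeff` is an assumed,
    commonly used value, not one reported in the source.
    """
    if not samples:
        return []
    out = [float(samples[0])]
    # Pair each sample with its predecessor and take the weighted difference.
    for prev, cur in zip(samples, samples[1:]):
        out.append(cur - coeff * prev)
    return out


# A constant (zero-frequency) signal is attenuated after the first sample,
# while a sample-to-sample alternating signal is boosted.
flat = pre_emphasize([1.0, 1.0, 1.0], coeff=0.5)
alternating = pre_emphasize([1.0, -1.0, 1.0], coeff=0.5)
```

With `coeff=0.5`, the constant input drops to half its amplitude after the first sample, whereas the alternating input grows to 1.5 times its amplitude.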


“…The vowel target was usually marked where F1 reached a maximum value in open vowels and where F2 reached a maximum/minimum value in front/back vowels (Harrington & Cassidy, 1999). If the formants showed either little change or no evidence of reaching an asymptote within the vowel, an intensity peak was sometimes used to position the vowel target; if there was no evidence of an intensity peak, then the vowel target was positioned at the vowel's acoustic midpoint.…”

Section: Methods (mentioning)

confidence: 99%

Monophthongal vowel changes in Received Pronunciation: an acoustic analysis of the Queen's Christmas broadcasts

Harrington, Palethorpe, Watson

2000

Journal of the International Phonetic Association


In this paper we analyse the extent to which an adult's vowel space is affected by vowel changes to the community using a database of nine Christmas broadcasts made by Queen Elizabeth II spanning three time periods (the 1950's; the late 1960's/early 70's; the 1980's). An analysis of the monophthongal formant space showed that the first formant frequency was generally higher for open vowels, and lower for mid-high vowels in the 1960's and 1980's data than in the 1950's data, which we interpret as an expansion of phonetic height from earlier to later years. The second formant frequency showed a more modest compression in later, compared with earlier years: in general, front vowels had a decreased F2 in later years, while F2 of the back vowels was unchanged except for [u] which had a higher F2 in the 1960's and 1980's data. We also show that the majority of these F1 and F2 changes were in the direction of the vowel positions of 1980's Standard Southern British speakers reported in Deterding (1997). Our general conclusion is that there is evidence of accent change within the same individual over time and that the Queen's vowels in the Christmas broadcasts have shifted in the direction of a more mainstream form of Received Pronunciation.
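The vowel-target rule quoted in the citation statement for this paper (formant extremum first, then intensity peak, then acoustic midpoint) could be sketched roughly as follows. The quoted procedure was applied by hand with qualifiers such as "usually" and "sometimes", so this is a loose approximation under stated assumptions: the function name `vowel_target_index`, the `min_range_hz` threshold standing in for "little change", and the interior-maximum test standing in for "evidence of an intensity peak" are all hypothetical.

```python
def vowel_target_index(f1, f2, intensity, vowel_class, min_range_hz=50.0):
    """Return the frame index of an estimated vowel target.

    f1, f2, intensity: equal-length lists sampled across the vowel.
    vowel_class: 'open' (F1 maximum), 'front' (F2 maximum) or
    'back' (F2 minimum), following the quoted rule.
    min_range_hz: assumed threshold below which the formant is
    treated as showing "little change" (hypothetical value).
    """
    n = len(f1)
    if vowel_class == "open":
        track, pick = f1, max
    elif vowel_class == "front":
        track, pick = f2, max
    elif vowel_class == "back":
        track, pick = f2, min
    else:
        raise ValueError("vowel_class must be 'open', 'front' or 'back'")

    # Primary criterion: the formant extremum, if the track moves enough.
    if max(track) - min(track) >= min_range_hz:
        return track.index(pick(track))

    # First fallback: an intensity peak that is not at either edge
    # (assumed proxy for "evidence of an intensity peak").
    peak = intensity.index(max(intensity))
    if 0 < peak < n - 1:
        return peak

    # Last fallback: the acoustic midpoint of the vowel.
    return n // 2


# Open vowel whose F1 rises to a clear maximum at frame 2.
target = vowel_target_index(
    f1=[500, 650, 700, 660, 520],
    f2=[1500, 1480, 1470, 1480, 1500],
    intensity=[60, 66, 70, 65, 61],
    vowel_class="open",
)
```

A real analysis would also need to handle vowels that are both open and front or back, and would normally be checked by eye, as the quoted passage implies.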


“…At the end of this step, the energy of each band-pass filter is calculated. Then, in the fifth-step, it is subjected to the logarithmic compression for mimicking the humans' audio perception [65]. The discrete cosine transform (DCT) of the logarithmic output is taken to de-correlate the coefficients, and hence, the static features of the input signal are obtained, at the final step.…”

Section: Mel-frequency Cepstral Coefficients (mentioning)

confidence: 99%

A Review on Feature Extraction for Speaker Recognition under Degraded Conditions

Dişken, Tüfekçi, Sarıbulut, et al.

2016

IETE Technical Review


Speech is a signal that includes speaker's emotion, characteristic specification, phoneme information etc. Various methods have been proposed for speaker recognition by extracting specifications of a given utterance. Among them, short-term cepstral features are used excessively in speech, and speaker recognition areas because of their low complexity, and high performance in controlled environments. On the other hand, their performances decrease dramatically under degraded conditions such as channel mismatch, additive noise, emotional variability, etc. In this paper, a literature review on speaker-specific information extraction from speech is presented by considering the latest studies offering solutions to the aforementioned problem. The studies are categorized in three groups considering their robustness against channel mismatch, additive noise, and other degradations such as vocal effort, emotion mismatch, etc. For a more understandable representation, they are also classified into two tables by utilizing their classification methods, and used data-sets.
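The last two steps described in the citation statement for this review (logarithmic compression of the band-pass filter energies, followed by a discrete cosine transform) can be sketched as below. This is an illustrative sketch only: the function name `log_dct_cepstrum`, the energy floor, and the use of an unnormalized DCT-II are assumptions, and the earlier filterbank steps are taken as already done.

```python
import math


def log_dct_cepstrum(band_energies, num_coeffs=None, floor=1e-10):
    """Turn filterbank energies into cepstral coefficients.

    1. Log-compress each band energy (floored to avoid log(0);
       the floor value is an assumption).
    2. Apply an unnormalized DCT-II to de-correlate the log energies:
       c[k] = sum_j log_e[j] * cos(pi * k * (j + 0.5) / M)
    """
    log_e = [math.log(max(e, floor)) for e in band_energies]
    m = len(log_e)
    if num_coeffs is None:
        num_coeffs = m
    return [
        sum(log_e[j] * math.cos(math.pi * k * (j + 0.5) / m) for j in range(m))
        for k in range(num_coeffs)
    ]


# Four bands with identical energy e: every log energy equals 1, so all
# of the "shape" coefficients (k > 0) vanish and only c[0] is non-zero.
coeffs = log_dct_cepstrum([math.e] * 4)
```

Practical front ends usually keep only the first dozen or so coefficients and often apply an orthonormal scaling to the DCT; both choices are left out here to keep the sketch short.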

