In this study, vocal tract area functions for one American English speaker, recorded using magnetic resonance imaging, were used to simulate and analyze the acoustics of vowel nasalization. Computer vocal tract models and susceptance plots were used to study the three most important sources of acoustic variability involved in the production of nasalized vowels: velar coupling area, asymmetry of nasal passages, and the sinus cavities. Analysis of the susceptance plots of the pharyngeal and oral cavities, -(B(p)+B(o)), and the nasal cavity, B(n), helped in understanding the movement of poles and zeros with varying coupling areas. Simulations using two nasal passages clearly showed the introduction of extra pole-zero pairs due to the asymmetry between the passages. Simulations with the inclusion of maxillary and sphenoidal sinuses showed that each sinus can potentially introduce one pole-zero pair in the spectrum. Further, the right maxillary sinus introduced a pole-zero pair at the lowest frequency. The effective frequencies of the poles and zeros due to the sinuses, as observed in the sum of the oral and nasal cavity outputs, change with the configuration of the oral cavity, which may in turn change with the coupling area or with the vowel being articulated.
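The role of the susceptance plots can be stated compactly. As a sketch only, assuming a lossless model in which the pharyngeal, oral, and nasal branches meet at a single coupling point, the poles of the coupled system fall at frequencies where the susceptances looking into the three branches sum to zero, that is, where the two plotted curves intersect. The subscripted symbols below are simply the abstract's B(p), B(o), and B(n) written as functions of frequency f; the explicit dependence on f is a notational assumption, not something stated in the source.

```latex
B_p(f) + B_o(f) + B_n(f) = 0
\quad\Longleftrightarrow\quad
B_n(f) = -\bigl(B_p(f) + B_o(f)\bigr)
```

Under this reading, a change in the coupling area reshapes the curves and so moves their intersections, which is one way to picture the pole movement described above.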
Researchers in the past have suggested several acoustic correlates of nasalization, including extra pole-zero pairs near the first formant (F1), a reduction in F1 amplitude, and an increase in F1 bandwidth. Even though these correlates have been known for a long time, considerable work is still needed to automate the extraction of acoustic parameters (APs) for nasalization. This work examined 37 different APs, which were pared down to 8 based on the F statistic obtained by ANOVA. In preliminary experiments, an accuracy of 69.79% was obtained for the task of discriminating between oral and nasalized vowels on the TIMIT database using a support vector machine (SVM)-based classifier. The classification was done on a frame basis, and a segment was declared nasalized if more than 30% of its frames were found to be nasalized. Note that all vowels adjacent to nasal consonants were assumed to be nasalized. Thus, the true accuracy may be higher, since some vowels before nasal consonants may not in fact be nasalized. Further, these results were obtained using a linear kernel in the SVMs. We hope the results will improve when a radial basis function kernel is used. [Work supported by Honda and NSF Grant BCS0236707.]
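The frame-to-segment decision rule described above can be illustrated with a short sketch. This is not the authors' code: the function name `segment_is_nasalized`, its parameters, and the example frame labels are hypothetical, and the per-frame decisions are assumed to have already been produced by some frame-level classifier.

```python
from typing import Sequence


def segment_is_nasalized(frame_labels: Sequence[bool],
                         threshold: float = 0.30) -> bool:
    """Declare a vowel segment nasalized from its per-frame decisions.

    frame_labels: one boolean per frame (True = frame classified as nasalized).
    threshold:    fraction of frames that must be exceeded; 0.30 follows the
                  "more than 30%" rule quoted in the abstract.
    """
    if len(frame_labels) == 0:
        raise ValueError("segment has no frames")
    nasal_fraction = sum(bool(label) for label in frame_labels) / len(frame_labels)
    # Strict inequality: exactly 30% nasalized frames does not qualify.
    return nasal_fraction > threshold


# Hypothetical frame decisions for two vowel segments (illustrative only).
mostly_oral = [True] + [False] * 9        # 10% of frames nasalized
partly_nasal = [True, False, False, True, False]  # 40% of frames nasalized
```

Here `segment_is_nasalized(mostly_oral)` returns `False` and `segment_is_nasalized(partly_nasal)` returns `True`. The strict comparison mirrors the wording "more than 30%"; whether the original system treated the boundary case the same way is not stated.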