Simulation of glottal volume flow and vocal fold tissue movement was accomplished by numerical solution of a time-dependent boundary value problem, in which nonuniform, orthotropic, linear, incompressible vocal fold tissue media were surrounded by irregularly shaped boundaries, which were either fixed or subject to aerodynamic stresses. Spatial nonuniformity of the tissues was of the layered type, including a mucosal layer, a ligamental layer, and muscular layers. Orthotropy was required to stabilize the vocal folds longitudinally and to accommodate large variations in muscular stress. Incompressibility and vertical motions at the glottis played an important role in producing and sustaining phonation. A nominal configuration for male fundamental speaking pitches was selected, and the regulation of fundamental frequency, intensity, average volume flow, and vocal efficiency was investigated in terms of variations around this nominal configuration. The parameters varied were geometrical factors such as length, thickness, and depth, factors for shaping the glottis, as well as tissue elasticities, tissue viscosities, and subglottal pressure. Since nonlinear stress-strain properties were not included, subglottal pressure did not produce a pronounced effect upon fundamental frequency under these somewhat idealized conditions. F0 raising correlated strongly with increased tension in the ligament, and somewhat with increasing tension in the vocalis. F0 lowering correlated with increase in vocal fold length when the tensions were held constant, but not with increase in vocal fold thickness. Vocal intensity and efficiency are shown to have local maxima as the configurational parameters are varied one at a time. It appears that oral acoustic power output and vocal efficiency can be maximized by proper adjustments of longitudinal tension of nonmuscular (mucosal and ligamental) tissue layers in relation to muscular layers.
Quantitative verification of the "body-cover" theory is therefore suggested, and several further implications with regard to control of the human larynx are considered.
A new algorithm for automatically tracking speech formant frequencies has been developed. Dynamic programming is used to optimize formant trajectory estimates by imposing appropriate frequency continuity constraints. The continuity constraints are modulated by a stationarity function. The formant frequencies are selected from candidates proposed by solving for the roots of the linear predictor polynomial computed periodically from the speech waveform. The local costs of all possible mappings of the complex roots to formant frequencies are computed at each frame based on the frequencies and bandwidths of the component formants for each mapping. The cost of connecting each of these mappings with each of the mappings in the previous frame is then minimized using a modified Viterbi algorithm. Two sentences spoken by 88 males and 43 females were analyzed. The first three formants were tracked correctly in all sonorant regions in over 80% of the sentences. These performance results are based on spectrographic analysis and informal listening to formant-synthesized speech.
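The dynamic-programming step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it uses a plain (unmodified) Viterbi search, and the cost functions, the nominal targets in `NOMINAL_HZ`, the `bw_weight` value, and the treatment of the stationarity function as a per-frame weight are all assumptions made for illustration. Candidates are taken as already-computed (frequency, bandwidth) pairs, so the linear-prediction root solving is left out.

```python
from itertools import combinations

# Assumed nominal targets for F1..F3 in Hz (hypothetical, for illustration only).
NOMINAL_HZ = (500.0, 1500.0, 2500.0)


def local_cost(mapping, bw_weight=0.002):
    """Hypothetical local cost: penalize wide bandwidths and
    frequencies far from the assumed nominal targets."""
    return sum(bw_weight * bw + abs(f - nom) / nom
               for (f, bw), nom in zip(mapping, NOMINAL_HZ))


def transition_cost(prev, cur):
    """Hypothetical continuity cost: squared relative frequency change."""
    return sum(((f1 - f0) / f0) ** 2
               for (f0, _), (f1, _) in zip(prev, cur))


def track_formants(frames, stationarity=None):
    """Choose one candidate-to-formant mapping per frame by Viterbi search.

    frames: list of frames, each a list of (freq_hz, bandwidth_hz) candidates.
    stationarity: optional per-frame weights on the continuity cost
                  (assumed form of the stationarity modulation).
    Returns a list of (F1, F2, F3) frequency tuples, one per frame.
    """
    n = len(NOMINAL_HZ)
    if not frames:
        return []
    if stationarity is None:
        stationarity = [1.0] * len(frames)

    # Enumerate every frequency-ordered assignment of candidates to F1..F3.
    mappings = []
    for cands in frames:
        if len(cands) < n:
            raise ValueError("each frame needs at least %d candidates" % n)
        mappings.append(list(combinations(sorted(cands), n)))

    # Forward pass: cheapest cumulative cost of ending in each mapping.
    cost = [local_cost(m) for m in mappings[0]]
    back = []
    for t in range(1, len(frames)):
        new_cost, ptr = [], []
        for cur in mappings[t]:
            totals = [cost[j] + stationarity[t] * transition_cost(prev, cur)
                      for j, prev in enumerate(mappings[t - 1])]
            best = min(range(len(totals)), key=totals.__getitem__)
            new_cost.append(totals[best] + local_cost(cur))
            ptr.append(best)
        cost = new_cost
        back.append(ptr)

    # Backward pass: recover the cheapest path of mappings.
    idx = min(range(len(cost)), key=cost.__getitem__)
    path = [idx]
    for ptr in reversed(back):
        idx = ptr[idx]
        path.append(idx)
    path.reverse()
    return [tuple(f for f, _ in mappings[t][i]) for t, i in enumerate(path)]
```

Calling `track_formants` on a few frames in which one frame contains an extra wide-bandwidth candidate shows the intended behavior: the spurious candidate is skipped because including it raises both the local cost and the cost of connecting to neighboring frames.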
Voicing perception for final stops was studied for impaired- and for normal-hearing listeners when cues in naturally spoken syllables were progressively neutralized. The syllables were ten different utterances of /daep, daek, daet, daeb, daeg, daed/ spoken in random order by a male. The cue modifications consisted progressively of neutralized vowel duration, equalized occlusion duration, burst deletion, murmur deletion, vowel-transition interchange, and transition deletion. The impaired subjects had moderate-to-severe losses and showed at least 70% correct voicing for the unmodified syllables. For the voiced stops, vowel-duration adjustment and murmur deletion each resulted in significant reductions in voicing perception for more than one-third of the impaired listeners; all normals showed good performance following neutralization of these cues. For the voiceless stops, large percentages of both listener groups showed decreased voicing perception due to the burst deletion, though a majority of both groups performed well above chance even after the vowel-duration adjustment and the burst deletion. When the vowel off-going transitions were exchanged between cognate syllables in given pairs, the effect on voicing perception exhibited by many impaired- and all normal-hearing listeners implicated the vowel transitions as an important additional source of cues to final-stop voicing perception.