Research in speech synthesis is presently concerned with methods for improving intelligibility and naturalness. Most work to date, however, has dealt with intelligibility: the synthesized speech is intelligible but monotone and machinelike in quality. Although this type of speech is useful in some applications, users of speech synthesizers are demanding more natural-sounding synthesized speech. We address this issue in our research. In particular, we are studying the influence of the glottal source function on the production of synthesized speech. Our database consists of inverse-filtered speech, ultra-high-speed laryngeal films, and electroglottograph waveforms, all temporally synchronized in our experiments. We use the experimentally derived source waveforms obtained from these various methods as the excitation for a serial/parallel Klatt formant synthesizer to synthesize sentences. We then evaluate the naturalness of this synthetic speech with listening tests. Our goal is to rank-order the contributions of different source parameters to the naturalness of synthetic speech. More specifically, we report our results to date on the effect of source-tract interaction on the production of natural-sounding speech.
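To make the source-plus-formant-synthesizer arrangement concrete, the following is a minimal sketch of the general idea only, not the authors' system. It assumes a Rosenberg-style pulse as a stand-in for the experimentally derived glottal waveforms, a cascade (serial) branch only, no source-tract interaction, and illustrative formant frequencies and bandwidths for a neutral vowel. The function names `rosenberg_pulse`, `resonator`, and `synthesize_vowel` are hypothetical; the resonator uses the standard two-pole digital resonator difference equation associated with Klatt-type synthesizers.

```python
import math


def rosenberg_pulse(n_samples, open_frac=0.4, close_frac=0.16):
    """One period of a Rosenberg-style glottal flow pulse.

    Assumption: a parametric stand-in for a measured source waveform.
    """
    n_open = int(round(open_frac * n_samples))
    n_close = int(round(close_frac * n_samples))
    pulse = []
    for n in range(n_samples):
        if n < n_open:
            # Opening phase: raised-cosine rise from 0 toward 1.
            pulse.append(0.5 * (1.0 - math.cos(math.pi * n / n_open)))
        elif n < n_open + n_close:
            # Closing phase: quarter-cosine fall from 1 toward 0.
            pulse.append(math.cos(math.pi * (n - n_open) / (2.0 * n_close)))
        else:
            # Closed phase: no glottal flow.
            pulse.append(0.0)
    return pulse


def resonator(signal, freq_hz, bw_hz, fs_hz):
    """Two-pole resonator: y[n] = A*x[n] + B*y[n-1] + C*y[n-2], unity gain at DC."""
    t = 1.0 / fs_hz
    c = -math.exp(-2.0 * math.pi * bw_hz * t)
    b = 2.0 * math.exp(-math.pi * bw_hz * t) * math.cos(2.0 * math.pi * freq_hz * t)
    a = 1.0 - b - c
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = a * x + b * y1 + c * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize_vowel(f0_hz=100.0,
                     formants=((500.0, 60.0), (1500.0, 90.0), (2500.0, 150.0)),
                     fs_hz=10000,
                     n_periods=20):
    """Excite a cascade of formant resonators with a periodic glottal source."""
    period = int(round(fs_hz / f0_hz))
    flow = rosenberg_pulse(period) * n_periods
    # First difference approximates the radiation characteristic at the lips.
    excitation = [flow[0]] + [flow[i] - flow[i - 1] for i in range(1, len(flow))]
    out = excitation
    for freq_hz, bw_hz in formants:
        out = resonator(out, freq_hz, bw_hz, fs_hz)
    return out


if __name__ == "__main__":
    samples = synthesize_vowel()
    print(len(samples), "samples synthesized")
```

In a setup like the one described above, the `rosenberg_pulse` output would be replaced by a measured source waveform (for example, one obtained by inverse filtering), which is what would allow the effect of the source on naturalness to be compared across conditions.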