Text-to-speech (TTS) conversion is a process that converts written text into the corresponding speech. It is a very useful application for visually impaired and speech-impaired people. An optical character recognition (OCR)-based TTS system to help visually challenged people has been proposed [1]. The text produced by the OCR is converted into speech. The authors used a blind deconvolution method and pre-processing operations to remove the effects of noise and blur, so that the framework gives efficient results for visually challenged users. Nowadays, high-quality TTS software is commercially available for different languages. The most widely used speech synthesis approaches are articulatory synthesis, formant synthesis, concatenative synthesis and the hidden Markov model (HMM)-based approach. Each approach has its own advantages and disadvantages, depending on the language it is applied to. Among them, the concatenative synthesis approach is used in our system because it can generate natural-sounding speech as a consequence of using pre-recorded sound. There is a trade-off between the speech quality and the size of the system, depending on the speech unit chosen for concatenation. The speech units currently in use are word, syllable, phoneme, di-phone, tri-phone and so on. Many TTS systems [2−6] have been implemented using the concatenative method with different speech units, and they can generate high-quality synthesized speech. A numerical TTS synthesis system for three languages, Marathi, Hindi and English, is proposed in [7]. It combines a rule-based approach with a concatenation-based approach, and all utterances of the sound units are used for concatenation and generation of the speech signal. Two Arabic text-to-speech systems, both screen readers, are compared in [8]: non-visual desktop access (NVDA) and integrated bilingual solution for the blind or visually impaired, in the Arab (IBSAR).
They tested the quality of two systems in terms of
This paper discusses the approach used to develop a text-to-speech (TTS) synthesis system for the Myanmar language. The system was developed with the concatenative method, using phonemes as the basic units for concatenation. Because phonemes play a central role in the proposed system, the Myanmar phoneme inventory is presented in detail. In the Myanmar language, schwa is the only vowel allowed in a minor syllable, that is, a consonant with half the sound of the original one. If these half-sounds can be handled, the TTS quality will be high. After analyzing the number of phonemes and half-sound consonants to be recorded, we created a Myanmar phoneme speech database containing a total of 157 phoneme speech sounds, which can voice all Myanmar texts. These phonemes are fetched according to the output of the phonetic analysis modules and concatenated using a proposed new phoneme concatenation algorithm. According to the experimental results, the system achieved a high level of intelligibility and an acceptable level of naturalness. As an application area, it is intended for resource-limited devices, for example as a language learning app.
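The fetch-and-concatenate step described above can be illustrated with a minimal sketch. It is not the paper's proposed concatenation algorithm, which the abstract does not specify; it assumes a plain linear crossfade at each unit boundary. The names `synth_tone`, `concatenate_units`, `synthesize` and `database` are hypothetical, the three-entry database stands in for the 157 recorded sounds, and sine tones stand in for recordings.

```python
import math

SAMPLE_RATE = 16000  # assumed sampling rate, not taken from the paper


def synth_tone(freq_hz, n_samples):
    """Stand-in for a recorded phoneme: a sine tone as a list of floats."""
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
            for i in range(n_samples)]


def concatenate_units(units, overlap=80):
    """Join waveforms, blending `overlap` samples linearly at each boundary."""
    if not units:
        return []
    out = list(units[0])
    for unit in units[1:]:
        k = min(overlap, len(out), len(unit))
        start = len(out) - k
        for i in range(k):
            w = (i + 1) / (k + 1)  # weight of the incoming unit
            out[start + i] = (1 - w) * out[start + i] + w * unit[i]
        out.extend(unit[k:])
    return out


def synthesize(phonemes, database, overlap=80):
    """Fetch each phoneme's waveform from the database and concatenate them."""
    missing = [p for p in phonemes if p not in database]
    if missing:
        raise KeyError(f"phonemes not in database: {missing}")
    return concatenate_units([database[p] for p in phonemes], overlap)


# Hypothetical toy database; "@" is the SAMPA symbol for schwa and is given
# a shorter duration here to mimic a half-sound in a minor syllable.
database = {
    "m": synth_tone(220, 800),
    "a": synth_tone(440, 1600),
    "@": synth_tone(330, 480),
}

# Phoneme sequence as it might come from a phonetic analysis module.
waveform = synthesize(["m", "@", "m", "a"], database)
```

In this sketch each join shortens the output by `overlap` samples, so the crossfade length trades boundary smoothness against the duration of short units such as the schwa.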
Among the speech synthesis approaches, the concatenative method is one of the most popular, and it can produce more natural-sounding speech output. The most important challenge in this method is choosing an appropriate unit for creating the database. The speech units currently in use are word, syllable, di-phone, tri-phone and phoneme. The speech quality depends on the speech unit selected, so the choice involves a trade-off. This paper presents three speech synthesis systems for the Myanmar language, based respectively on syllable, di-phone and phoneme speech units, all using the concatenation method. We then compare the speech quality of the three systems using subjective tests.
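The trade-off between unit types can be made concrete by counting, for a given text, how many distinct units must be recorded and how many joins are needed at synthesis time. The sketch below is illustrative only: the function names and the romanized toy corpus are hypothetical, the syllables are not real Myanmar data, and a di-phone is assumed to span from the middle of one phoneme to the middle of the next, with `_` marking silence.

```python
def unit_sequences(syllables):
    """Split one utterance (a list of syllables, each a list of phonemes)
    into phoneme, di-phone and syllable unit sequences."""
    phonemes = [p for syl in syllables for p in syl]
    padded = ["_"] + phonemes + ["_"]  # "_" marks leading/trailing silence
    diphones = [f"{a}-{b}" for a, b in zip(padded, padded[1:])]
    syllable_units = ["".join(syl) for syl in syllables]
    return {
        "phoneme": phonemes,
        "di-phone": diphones,
        "syllable": syllable_units,
    }


def summarize(corpus):
    """Count the distinct units to record and the joins needed per unit type."""
    stats = {}
    for utterance in corpus:
        for kind, seq in unit_sequences(utterance).items():
            entry = stats.setdefault(kind, {"inventory": set(), "joins": 0})
            entry["inventory"].update(seq)
            entry["joins"] += max(len(seq) - 1, 0)
    return {kind: {"inventory": len(e["inventory"]), "joins": e["joins"]}
            for kind, e in stats.items()}


# Hypothetical toy corpus: four utterances of two syllables each.
corpus = [
    [["m", "i"], ["n", "a"]],
    [["n", "a"], ["m", "a"]],
    [["n", "i"], ["m", "u"]],
    [["n", "u"], ["m", "i"]],
]

for kind, s in summarize(corpus).items():
    print(f"{kind:9s} inventory={s['inventory']:3d} joins={s['joins']:3d}")
```

Larger units need fewer joins per utterance, which tends to favour naturalness, but their inventory grows with the amount of text to be covered, which increases the size of the recorded database.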