A speech synthesizer that sounds like a human voice is preferred over one that sounds robotic, so an effective prosody model is essential for improving the naturalness of synthesized speech. This paper therefore focuses on developing a prosody prediction model based on sentiment analysis for a Tamil speech synthesizer. Two variants of the prosody prediction model using SentiWordNet are evaluated: one without a stemmer and one with a stemmer. The variant with a stemmer performs considerably better because it handles the highly agglutinative and inflected words of Tamil more effectively, as illustrated in this paper. It achieves a classification accuracy of 77% on the test set, compared with 57% for the variant without a stemmer.
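To make the role of the stemmer concrete, the following is a minimal sketch of lexicon-based polarity classification with and without a stemming step. It is not the paper's implementation: the lexicon entries and scores, the romanized tokens, the suffix list, and the names `LEXICON`, `SUFFIXES`, `strip_suffix`, and `classify` are all hypothetical and chosen only for illustration. A real system would look words up in SentiWordNet and use a proper Tamil stemmer.

```python
from typing import Callable, Iterable, Optional

# Toy polarity lexicon keyed by root form.
# Hypothetical scores for illustration; these are not SentiWordNet values.
LEXICON = {"magizhchi": 0.8, "sogam": -0.7}

# Hypothetical suffix list (romanized), checked in order. Not a real Tamil stemmer.
SUFFIXES = ("yaana", "aaga", "il")


def strip_suffix(word: str) -> str:
    """Remove the first matching suffix, keeping at least a three-letter stem."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def classify(tokens: Iterable[str],
             stem: Optional[Callable[[str], str]] = None) -> str:
    """Sum lexicon scores over tokens, optionally stemming each token first."""
    score = 0.0
    for token in tokens:
        key = stem(token) if stem else token
        score += LEXICON.get(key, 0.0)  # words missing from the lexicon add nothing
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# The inflected form is absent from the lexicon, so the lookup fails without stemming.
without_stemmer = classify(["magizhchiyaana"])
with_stemmer = classify(["magizhchiyaana"], stem=strip_suffix)
```

In this toy setup the inflected token is classified as neutral without stemming and as positive once the suffix is stripped, which mirrors the kind of lexicon miss that a stemmer is meant to avoid in an agglutinative language.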
The primary aim of Human-Computer Interaction (HCI) is to deliver the power of computers and communication systems to people in an easily accessible and understandable form. HCI in a person's native or first language is always invigorating. Developing a Tamil Text-To-Speech (TTS) system will provide a convenient medium of interaction for Tamil speakers. This paper focuses on the development of pronunciation models, a vital component of a Tamil TTS system. Developing a pronunciation model for Tamil is more difficult than for other languages because the letter-to-sound correspondence is non-trivial. Two syllable-based pronunciation models that we developed are discussed in this paper. The first is a syllable-centric rule-based pronunciation model that generates well-founded training data, which is then used to train the second, a Conditional Random Field (CRF) based model. Both models achieve high Mean Similarity Scores, 0.97 and 0.94 respectively, outperforming the existing rule-driven and data-driven models in the literature. These syllable-based pronunciation models will improve the performance of a Tamil TTS system.