Natural Speech Synthesizer for Blind Persons Using Hybrid Approach

Gahlawat, Mukta; Malik, Amita; Bansal, Poonam

doi:10.1016/j.procs.2014.11.088

Cited by 12 publications

(4 citation statements)

References 6 publications


“…Though this architecture, helped in decreasing the training time by replacing memory units, it compromised the naturalness of speech. Blindfolded are used as test subjects [16] to evaluate the speech produced by the system since they are an important target audience for an emotional text-to-speech system.…”

Section: Speech Synthesis From Text (mentioning)

confidence: 99%

Prosodic Speech Synthesis of Narratives Depicting Emotional Diversity Using Deep Learning

Shah, Gupta, Jardosh, et al. 2021

Advances in Intelligent Systems and Computing


Emotions are an essential part of speech or communication, which is why they cannot be neglected. The existing text-to-speech systems are not the most appropriate at conveying the emotions present behind the text. The systems can speak out the text monotonically lacking expressiveness. In this paper, an Expressive Text-to-Speech Synthesis System (ETSSS) is proposed which considers the dominant emotions in the text provided. ETSSS works in two parts: first, it identifies the label behind the text, and second produces expressive speech. In the first part, the input text is given an emotional label. Later, this label is used to generate expressive and prosodic speech. Labeling emotions in ETSSS is carried out using BERT which has an accuracy of 94%, 90%, and 90% for disgust, amused, and anger, respectively. The speech synthesis with the emotion module of ETSSS achieves a good MOS of 3.8 for anger, 3.5 for disgust, and 3.2 for amused.

Introduction: Generating speech from text has been used for the past decade. It is important to note that emotions in speech play an important role. The three most common aspects of speech include intelligence, naturalness, and expressiveness. Prosody is defined as


“…985.92 should be translated into a stream of phones using a grapheme-to-phoneme as ‘nine hundred eighty-five and ninety-two coins’ in English, ‘ ’ in Marathi, ‘ ’ in Hindi. The numerical parts of speech can be recognized but it needs speech library [3][4][5][6][15][16][17][18][19]. The next section can be seen how to prepare speech library.…”

Section: B Recognization Speech Units (mentioning)

confidence: 99%

“…A numerical method of expressing voice quality is called OQI (Overall Quality Issue). OQI is expressed in one number from 0 to 5 like a being the worse-quality and 5 the well-quality [19]. There are different criteria for awareness which is one type of understanding such as from much better to much worse for awareness as per Table-8.…”

Section: Overall Quality Issue (mentioning)

confidence: 99%

Efficient Model for Numerical Text-To-Speech Synthesis System in Marathi, Hindi and English Languages

Ramteke, Ramteke 2017

IJIGSP


Abstract: The paper proposes a numerical TTS synthesis system in Marathi, Hindi and English languages. The system is in audible forms based on the sounds generated from several numeric units. A hybrid technique is newly launched for a numerical text-to-speech technology. The technique is divided into different phases. These numerical phases include pre-processing, numeric unit detection, numeric and speech unit matching, speech unit concatenation and speech generation. In order to enhance the syntactic generation of audible forms in three languages, two discipline tests were performed. The prosodic test has been obtained for evaluating on the statistical readings. Overall quality issue (OQI) test is a subjective test which is performed by various persons who are aware of three mentioned languages. On the basis of two distinct parameters of OQI test, all scores are positively provided. Initial parameter compromises with listening quality. The second parameter, awareness rate improves a level of the intelligibility. The ultimate satisfactory results of artificial sound generation in three unrelated languages were touched to humankind voice.


“…Screen magnifiers perform screen magnification for those who still have some degree of remnant vision [2]. Voice synthesizers literally read the text displayed on the computer screen [3], and Braille terminals display the text on the screen in Braille code [4].…”

Section: Introduction (mentioning)

confidence: 99%

Design and Implementation of a Low-Cost Printer Head for Embossing Braille Dots on Paper

2020

IJETER


This paper presents the design, implementation, and prototype of a low-cost Braille embossing mechanism. The proposal is a printer head integrating three hammers that, upon actuation, stamp readable dots on the paper. Inspired in the rotary cam-follower mechanism, the hammers are actuated by a single servomotor which rotation determines which hammer strikes the paper. Braille characters can be quickly embossed using the proposed printer head. Affordable and efficient Braille embossers for home use can be envisaged using this new action approach.
