2014 International Conference on Asian Language Processing (IALP) 2014
DOI: 10.1109/ialp.2014.6973509

Influence of various asymmetrical contextual factors for TTS in a low resource language

Abstract: The generalized statistical framework of the Hidden Markov Model (HMM) has been successfully carried over from speech recognition to speech synthesis. In this paper, we apply the HMM-based Speech Synthesis (HTS) method to Gujarati (one of the official languages of India), and we adapt and evaluate HTS for the language. In addition, to understand the influence of asymmetrical contextual factors on the quality of synthesized speech, we have conducted a series of experiments. Evaluation …

Cited by 2 publications (3 citation statements)
References 17 publications
“…Since speech is a sequential data, extracting the contextual features from the speech, captures the local features (including coarticulation) and preserves the crucial harmonics [31,32]. In speech perception, it has been shown that the surrounding acoustic context, impacts the human perception [33][34][35].…”
Section: Effect Of Contextual Information In Nam2whsp System
Citation type: mentioning
confidence: 99%
“…Inspired by the study reported in [32], we also analyze the importance of training models by taking an asymmetric contextual frames as an input to the network (Panel II in Figure 2). No significant variations could be observed in terms of MCD scores for GAN-based systems (as shown in Figure 2 (c)), whereas significant improvement in the performance of the GAN-based system over DNN-based system, is observed in terms of PESQ score (as shown in Figure 2 (d)) (notably 4LC1R).…”
Section: Effect Of Contextual Information In Nam2whsp System
Citation type: mentioning
confidence: 99%
“…To overcome this issue, Temporal-Context (TC) INCA algorithm was proposed [22], which tries to incorporate the contextual information. Furthermore, since speech is a sequential data, extracting the contextual features from the speech, captures the local features (including coarticulation) and preserves the crucial harmonics [23,24]. It is well known in the speech literature that the surrounding acoustic context affects the human speech perception [25][26][27].…”
Section: Introduction
Citation type: mentioning
confidence: 99%