2006
DOI: 10.1016/j.specom.2005.10.003

Modeling stylized invariance and local variability of prosody in text-to-speech synthesis

Cited by 11 publications (10 citation statements)
References 25 publications
“…One of the basic characteristics of natural speech is its variability (Braun et al, Chu et al, 2006). In order to generate models for speech production, synthesis, and perception this variability should be accounted for appropriately.…”
Section: Emotional Regions In F0 Mean-range Space (mentioning)
confidence: 99%
“…For example, its implementation in emotional speech synthesis is limited (Cowie et al, 2001) because it does not specifically account for the variability present in the natural speech (Braun et al, Chu et al, 2006; Pell, 2001). In the traditional analysis, an emotional utterance is represented as a point in the parameter space. We suggest a new model where each utterance is represented by an "emotional region" in the parameter space.…”
Section: Introduction (mentioning)
confidence: 99%
“…Because different pronunciations of the same word with a good prosody can result in very different prosodic patterns [12], it is necessary to have several gold standard references at our disposal when it comes to comparing prosodic contours in the task of prosody assessment. We chose to use the k-means algorithm to generate several reference contours by clustering similar prosodic contours together, as suggested by Cheng [7].…”
Section: Gold-standard Native References (mentioning)
confidence: 99%
“…When different speakers are required to read the same sentence, prosodic diversity is usually observed. This diversity involves pitch accents placement and shape, phrase breaks location and other aspects of prosody, and it is even observed when it is a single speaker who reads the same sentence at two different times [2]. The variability observed in pitch accents and phrase breaks placement has suggested a distinction between optional and compulsory prosodic events, a distinction that, in turn, has been used to reformulate the evaluation of automatic predictors ([3], [4], [5]).…”
Section: Introduction (mentioning)
confidence: 99%