2014
DOI: 10.1109/jstsp.2013.2294938

Integrated Expression Prediction and Speech Synthesis From Text

Cited by 13 publications (10 citation statements)
References 26 publications
“…In [3], a multiple linear regression model is presented to predict the most appropriate hidden Markov model (HMM) parametric voice style from a created set, and has also been implemented and evaluated including statistical models such as Gaussian mixture models (GMM) [2]. Moreover, the prediction of expressions from text and the synthesis of a particular expression have been integrated together [14]. In addition, prosody has been structured as a multi-level hierarchy for emotional speech synthesis [5], and its correlation with both hierarchical information structure and discourse has also been analyzed for speech synthesis purposes [15,16,17].…”
Section: Related Work (mentioning)
confidence: 99%
“…Also, in order to account for the practically infinite amount of expressive styles, automatized techniques must be applied to deal with data. The aforementioned problem of predicting expressiveness from text is addressed in for instance Chen et al (2014); Jauk et al (2016); Jauk and Bonafonte (2016a); Lorenzo Trueba (2016). The methods proposed in these works can also be used for data clustering in order to gain training data, as done for example by Watts (2012).…”
Section: Expressive Speech Synthesis (mentioning)
confidence: 99%
“…Emotional Text-To-Speech (TTS) is a challenging but important part in speech synthesis since rendering emotion makes speech sound more natural [1]. It permits to convey essential nonlinguistic information that can be extracted from text in addition to the commonly modelled linguistic aspects, such as syllable stress and punctuation.…”
Section: Introduction (mentioning)
confidence: 99%
“…Without this mapping the speech synthesis system cannot determine the appropriate expression cluster for synthesizing a given sentence. In [1], the authors propose a method for expression prediction from text and speech, in which both the expression predictor and speech synthesizer share the same training data. This method permits to model intra-speaker and inter-speaker variabilities that influence expression prediction and represent a higher number of expressions than the typical limited set of emotions of text predictor methods.…”
Section: Introduction (mentioning)
confidence: 99%