Interspeech 2018
DOI: 10.21437/interspeech.2018-2467

Expressive Speech Synthesis Using Sentiment Embeddings

Abstract: In this paper we present a DNN-based speech synthesis system trained on an audiobook, with sentiment features predicted by the Stanford sentiment parser. The baseline system uses a DNN to predict acoustic parameters based on conventional linguistic features, as they have been used in statistical parametric speech synthesis. The predicted parameters are transformed into speech using a conventional high-quality vocoder. In this paper, the conventional linguistic features are enriched with sentiment features. …
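
The abstract describes a straightforward conditioning scheme: per-frame linguistic features are concatenated with a sentence-level sentiment vector before being mapped to vocoder parameters by a feed-forward DNN. Below is a minimal sketch of that idea, assuming PyTorch and illustrative feature dimensions; the layer widths and feature sizes are assumptions, not the paper's configuration.

```python
# Hedged sketch (not the paper's exact setup): a feed-forward DNN maps
# per-frame linguistic features, concatenated with a sentence-level sentiment
# vector, to vocoder acoustic parameters. All dimensions below are assumptions.
import torch
import torch.nn as nn

LINGUISTIC_DIM = 300   # assumed size of the conventional linguistic feature vector
SENTIMENT_DIM = 25     # assumed size of the sentence-level sentiment embedding
ACOUSTIC_DIM = 187     # assumed vocoder parameter count (e.g. MGC + lf0 + BAP + V/UV)

class SentimentConditionedDNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LINGUISTIC_DIM + SENTIMENT_DIM, 1024),
            nn.Tanh(),
            nn.Linear(1024, 1024),
            nn.Tanh(),
            nn.Linear(1024, ACOUSTIC_DIM),
        )

    def forward(self, linguistic, sentiment):
        # Broadcast the single sentiment vector to every frame of the sentence,
        # then concatenate it with the per-frame linguistic features.
        sentiment = sentiment.expand(linguistic.size(0), -1)
        return self.net(torch.cat([linguistic, sentiment], dim=-1))

# Usage: predict acoustic frames for one sentence, then hand them to a vocoder.
model = SentimentConditionedDNN()
linguistic = torch.randn(500, LINGUISTIC_DIM)   # 500 frames of linguistic features
sentiment = torch.randn(1, SENTIMENT_DIM)       # one sentiment vector per sentence
acoustic = model(linguistic, sentiment)         # shape: (500, ACOUSTIC_DIM)
```

The vocoder resynthesis step is omitted here; in practice the network output would be split into its spectral, F0, and aperiodicity streams before waveform generation.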

Cited by 4 publications (2 citation statements). References 13 publications.

Citation statements:
“…The study on expressive speech synthesis is focused on prosody modeling [22]-[24], where speech prosody generally refers to intonation, stress, speaking rate, and phrase breaks. Prosodic phrasing [25]-[28] plays an important role in both affective and linguistic expressions.…”
Section: Introduction (mentioning)
confidence: 99%
“…On the other hand, in audiobooks and news reading systems, the emotions when the system speaks are often determined by the content of the text. In such applications, it is desirable that the emotional parameters used in the speech synthesis system be automatically estimated from the input text (Bellegarda, 2011; Jauk et al., 2018; Shaikh et al., 2009; Sudhakar and Bensraj, 2014; Trilla and Alías, 2013; Vanmassenhove et al., 2016).…”
Section: Introduction (mentioning)
confidence: 99%
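
This cited context argues that, for audiobooks and news reading, the affective control signal should be derived automatically from the input text rather than annotated by hand. Purely as an illustration (the present paper uses the Stanford sentiment parser; the stand-in below is not any cited system's method), an off-the-shelf analyzer such as NLTK's VADER can turn a sentence into a small sentiment vector suitable for conditioning a model like the one sketched above.

```python
# Stand-in example: derive a sentence-level sentiment vector from input text
# with NLTK's VADER analyzer (not the Stanford parser used in the paper).
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)  # one-time lexicon download

def sentiment_vector(text):
    """Return [neg, neu, pos, compound] scores for one sentence."""
    scores = SentimentIntensityAnalyzer().polarity_scores(text)
    return [scores["neg"], scores["neu"], scores["pos"], scores["compound"]]

# Prints four scores; exact values depend on the VADER lexicon version.
print(sentiment_vector("The storm had passed, and the village woke to a gentle morning."))
```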