Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing 2023
DOI: 10.18653/v1/2023.emnlp-main.267

DPP-TTS: Diversifying prosodic features of speech via determinantal point processes

Seongho Joo, Hyukhun Koh, Kyomin Jung

Abstract: With the rapid advancement in deep generative models, recent neural Text-To-Speech (TTS) models have succeeded in synthesizing humanlike speech. There have been some efforts to generate speech with various prosody beyond monotonous prosody patterns. However, previous works have several limitations. First, typical TTS models depend on the scaled sampling temperature for boosting the diversity of prosody. Speech samples generated at high sampling temperatures often lack perceptual prosodic diversity, thereby ham…

Cited by: 0 publications
References: 19 publications