2018 IEEE Spoken Language Technology Workshop (SLT) 2018
DOI: 10.1109/slt.2018.8639550

Neural TTS Voice Conversion

Cited by 9 publications
(16 citation statements)
References 4 publications
“…In this paper we show that we can get a considerable quality improvement by modifying a TTS system that produced the WORLD vocoder parameters [6] to predict parameters for LPCNet [8]. As in the previous work [6], we conduct multiple adaptation experiments, applied on multiple VCTK voices [9] and show that the new system has much better quality and similarity to the target voices but can still run much faster than real-time in a single-CPU mode.…”
Section: Introduction (mentioning)
confidence: 87%
“…In the current work, the prosody generation and adaptation network follows the one presented in our previous work [6], where one can refer to for more details. It generates a 4-dimensional prosody vector per TTS unit, comprising the unit's log-duration, initial log-pitch, final log-pitch and log-energy.…”
Section: Prosody Generator (mentioning)
confidence: 99%
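
The 4-dimensional prosody vector described in the statement above can be pictured as a small record with one field per component. The following is a minimal sketch only, not the cited authors' implementation: the class name `ProsodyVector`, the helper `from_raw`, the use of natural logarithms, and the units (seconds, Hz, an unspecified positive energy measure) are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ProsodyVector:
    """Hypothetical container for one TTS unit's four prosody targets."""
    log_duration: float
    log_pitch_initial: float
    log_pitch_final: float
    log_energy: float

    @classmethod
    def from_raw(cls, duration_s, pitch_initial_hz, pitch_final_hz, energy):
        # Assumption: natural log of strictly positive raw values.
        # The source does not state the log base or the units.
        values = (duration_s, pitch_initial_hz, pitch_final_hz, energy)
        if any(v <= 0 for v in values):
            raise ValueError("all raw prosody values must be positive")
        return cls(*(math.log(v) for v in values))

    def as_list(self):
        # Component order follows the quoted description:
        # log-duration, initial log-pitch, final log-pitch, log-energy.
        return [
            self.log_duration,
            self.log_pitch_initial,
            self.log_pitch_final,
            self.log_energy,
        ]


# Illustrative values only; they are not taken from the paper.
vec = ProsodyVector.from_raw(0.08, 120.0, 135.0, 0.5)
print(len(vec.as_list()))
```

Keeping the values in the log domain, as the quoted text does, means a relative change such as a pitch rise from 120 Hz to 135 Hz becomes an additive difference between the initial and final log-pitch components.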