ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2019
DOI: 10.1109/icassp.2019.8682804

LPCNet: Improving Neural Speech Synthesis through Linear Prediction

Abstract: Neural speech synthesis models have recently demonstrated the ability to synthesize high quality speech for text-to-speech and compression applications. These new models often require powerful GPUs to achieve real-time operation, so being able to reduce their complexity would open the way for many new applications. We propose LPCNet, a WaveRNN variant that combines linear prediction with recurrent neural networks to significantly improve the efficiency of speech synthesis. We demonstrate that LPCNet can achiev…

Citations: cited by 371 publications (338 citation statements)
References: 17 publications
“…In detail, the distribution is sharpened by directly reducing the generated scale parameters shown in Eq. (5). Because the buzziness and the hiss of synthetic speech are sensitive to the sharpness of the distribution, the scale parameters must be adjusted carefully.…”
Section: Conditional Distribution Sharpening For MDN Model (mentioning)
confidence: 99%
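
The sharpening described in this statement can be illustrated with a minimal sketch. It assumes a mixture of Gaussian components and a single multiplicative factor `sharpen` applied to every scale; the cited Eq. (5) is not shown here, so the function name, the Gaussian choice, and the factor are all illustrative assumptions rather than the citing paper's formulation.

```python
import numpy as np

def sample_mixture(weights, means, scales, sharpen=1.0, rng=None):
    """Draw one sample from a 1-D Gaussian mixture.

    `sharpen` (hypothetical parameter) multiplies every scale before
    sampling; values below 1.0 narrow the distribution.
    """
    rng = np.random.default_rng() if rng is None else rng
    weights = np.asarray(weights, dtype=float)
    # Pick a mixture component according to the normalized weights.
    component = rng.choice(len(weights), p=weights / weights.sum())
    # Sample from that component with its scale reduced by `sharpen`.
    return rng.normal(means[component], sharpen * scales[component])

rng = np.random.default_rng(0)
wide = [sample_mixture([1.0], [0.0], [1.0], 1.0, rng) for _ in range(2000)]
narrow = [sample_mixture([1.0], [0.0], [1.0], 0.5, rng) for _ in range(2000)]
```

With a factor of 0.5 the samples cluster more tightly around the component means. That spread is what the statement says must be tuned carefully.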
“…In the LPCNet vocoder, the specification was almost the same as its original version [5]. In the frame-level network, the dimension of the two FC layers was set to 64.…”
Section: Neural Vocoders (mentioning)
confidence: 99%
“…Linear predictive coding is known to be a suitable model of the vocal tract response [15] and is still used in SOTA approaches for speech coding and synthesis [14]. Given a signal x_k at sample k, LPC can be described as the linear combination x…”
Section: Linear Predictive Coding (mentioning)
confidence: 99%
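
The quoted equation is cut off, so the following is only a sketch of the usual form of linear prediction: each sample is estimated as a weighted sum of the preceding samples. The sign convention, the zero padding at the start, and the names `lpc_predict`, `a`, and `residual` are assumptions made for illustration.

```python
import numpy as np

def lpc_predict(x, a):
    """Predict x[k] as sum_{i=1..M} a[i-1] * x[k-i].

    Samples before the start of the signal are treated as zero
    (an assumption of this sketch).
    """
    x = np.asarray(x, dtype=float)
    a = np.asarray(a, dtype=float)
    x_hat = np.zeros_like(x)
    for i, coeff in enumerate(a, start=1):
        # Shift the signal by i samples and accumulate its weighted copy.
        x_hat[i:] += coeff * x[:-i]
    return x_hat

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
a = np.array([2.0, -1.0])   # x_hat[k] = 2*x[k-1] - x[k-2]
x_hat = lpc_predict(x, a)
residual = x - x_hat        # the part the predictor does not explain
```

For this ramp signal the two-tap predictor is exact once two past samples are available, so the residual is nonzero only at the first sample.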
“…LPC is able to perfectly model a superposition of multiple sinusoidals given enough coefficients. Due to this property, LPC finds a use case in speech coding and synthesis [14]. Yet, it is often only applied on time-domain signals as a post-processing step.…”
Section: Introduction (mentioning)
confidence: 99%
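
The claim about sinusoids can be checked numerically. The sketch below fits predictor coefficients by ordinary least squares (an assumed fitting method, not necessarily the one used in the citing paper) to a sum of two sinusoids with arbitrarily chosen frequencies. Two coefficients per sinusoid are enough for an essentially exact prediction.

```python
import numpy as np

def fit_lpc(x, order):
    """Least-squares fit of a so that x[k] ~ sum_i a[i-1] * x[k-i]."""
    x = np.asarray(x, dtype=float)
    # The row for sample k holds x[k-1], x[k-2], ..., x[k-order].
    rows = np.array([x[k - order:k][::-1] for k in range(order, len(x))])
    targets = x[order:]
    a, *_ = np.linalg.lstsq(rows, targets, rcond=None)
    return a, rows @ a - targets

k = np.arange(200)
x = np.sin(0.3 * k) + 0.5 * np.sin(1.1 * k + 0.4)  # two sinusoids

a4, err4 = fit_lpc(x, order=4)   # enough coefficients
a2, err2 = fit_lpc(x, order=2)   # too few coefficients
max_err4 = np.max(np.abs(err4))
max_err2 = np.max(np.abs(err2))
```

Here `max_err4` sits at the level of floating-point round-off, while `max_err2` stays clearly nonzero because two coefficients can cancel only one of the two frequencies.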