2023
DOI: 10.1109/taslp.2023.3275032

Harmonic-Net: Fundamental Frequency and Speech Rate Controllable Fast Neural Vocoder

Abstract: There is a need to improve the synthesis quality of HiFi-GAN-based real-time neural speech waveform generative models on CPUs while preserving the controllability of fundamental frequency (f_o) and speech rate (SR). For this purpose, we propose Harmonic-Net and Harmonic-Net+, which introduce two extended functions into the HiFi-GAN generator. The first extension is a downsampling network, named the excitation signal network, that hierarchically receives multi-channel excitation signals corresponding to f_o. …

Cited by: 4 publications
References: 50 publications