2022
DOI: 10.48550/arxiv.2203.15643
Preprint

Nix-TTS: Lightweight and End-to-End Text-to-Speech via Module-wise Distillation

Cited by 2 publications (2 citation statements)
References 0 publications
“…Light-Speech [24] uses neural architecture search to achieve 15X model compression, resulting in a final model with 1.8M parameters. Nix-TTS [25] builds an end-to-end TTS system with 5.23M parameters using knowledge distillation. These previous works built their models on LJSpeech, a single-speaker dataset.…”
Section: B. Track 2: Lightweight TTS
Mentioning confidence: 99%
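The statement above attributes Nix-TTS's 5.23M-parameter footprint to knowledge distillation (module-wise, per the paper's title). The sketch below is a minimal, hypothetical illustration of distilling one compact student module against a frozen teacher module; the module structure, sizes, and loss are illustrative assumptions, not the actual Nix-TTS implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallBlock(nn.Module):
    # Stand-in for one module (e.g. an encoder); layer sizes are arbitrary.
    def __init__(self, dim_in, dim_out):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim_in, 128), nn.ReLU(), nn.Linear(128, dim_out)
        )
    def forward(self, x):
        return self.net(x)

def module_distill_loss(student_out, teacher_out):
    # Match the student's output for this module to the frozen teacher's output.
    return F.mse_loss(student_out, teacher_out.detach())

# Toy usage: distill a hypothetical student encoder against a frozen teacher encoder.
teacher_encoder = SmallBlock(80, 192)   # pretend this is a pre-trained teacher module
student_encoder = SmallBlock(80, 192)   # compact student module to be trained
for p in teacher_encoder.parameters():
    p.requires_grad_(False)

x = torch.randn(4, 50, 80)              # (batch, frames, features); shapes are arbitrary
loss = module_distill_loss(student_encoder(x), teacher_encoder(x))
loss.backward()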
“…Recent attempts to build on-device neural TTS include On-device TTS [7], LiteTTS [8], PortaSpeech [9], LightSpeech [10] and Nix-TTS [11]. On-device TTS is slow and resource-intensive, since it uses a modified Tacotron2 for mel spectrogram generation and WaveRNN as the vocoder.…”
Section: Introduction
Mentioning confidence: 99%