Interspeech 2022
DOI: 10.21437/interspeech.2022-264

Visualising Model Training via Vowel Space for Text-To-Speech Systems

Abstract: With the recent developments in speech synthesis via machine learning, this study explores incorporating linguistics knowledge to visualise and evaluate synthetic speech model training. If changes to the first and second formant (in turn, the vowel space) can be seen and heard in synthetic speech, this knowledge can inform speech synthesis technology developers. A speech synthesis model trained on a large General American English database was fine-tuned into a New Zealand English voice to identify if the chang…

Cited by 2 publications (6 citation statements)
References 20 publications

“…We extend the previous vowel space analysis approach [12] from mono-lingual to cross-lingual scenarios. Our vowel analysis method comprises the following steps.…”
Section: Vowel Space Analysis Methods (mentioning)
confidence: 99%
“…As in [12], the F1 and F2 values are extracted at the middle point of the chosen segment. We synthesize 100 samples per vowel, and set the median as the representative.…”
Section: Formant Estimation (mentioning)
confidence: 99%
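
The procedure quoted above (measure at the segment midpoint, then take the median over many synthesized samples) can be sketched roughly as follows. This is a minimal illustration, not the citing authors' code: the function names, the input layout, and the choice to take the F1 and F2 medians independently are all assumptions, and the numbers in the usage example are invented placeholders rather than measured formants.

```python
from statistics import median


def segment_midpoint(start_s, end_s):
    """Return the time (in seconds) halfway through a vowel segment."""
    return start_s + (end_s - start_s) / 2.0


def representative_formants(samples):
    """Reduce per-sample (F1, F2) pairs in Hz to one median pair per vowel.

    `samples` maps a vowel label to a list of (F1, F2) tuples, one tuple per
    synthesized sample. Assumption: the medians of F1 and F2 are taken
    independently; the quoted text does not say how the median is defined
    for a two-dimensional point.
    """
    result = {}
    for vowel, pairs in samples.items():
        if not pairs:
            continue  # nothing measured for this vowel, so skip it
        f1_values = [f1 for f1, _ in pairs]
        f2_values = [f2 for _, f2 in pairs]
        result[vowel] = (median(f1_values), median(f2_values))
    return result


if __name__ == "__main__":
    # Invented placeholder values, three samples instead of 100.
    demo = {"FLEECE": [(300.0, 2300.0), (320.0, 2250.0), (310.0, 2400.0)]}
    print(segment_midpoint(1.0, 2.0))
    print(representative_formants(demo))
```

Using the median rather than the mean keeps a single badly tracked formant from shifting a vowel's position in the plotted vowel space.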