2016
DOI: 10.15388/informatica.2016.100

Automatic Parameters Estimation of the D. Klatt Phoneme Duration Model

Abstract: Phoneme duration modelling is one of the stages in prosody modelling for text-to-speech systems. The rule-based phoneme duration model proposed by Klatt (1979) is still quite a popular method. One of the main shortcomings of this method is that the values of the parameters are selected in an experimental way. This work proposes a new iterative algorithm for the automatic estimation of the factors for the Klatt model using the corpus of an annotated audio record of the speaker. The phoneme duration models were …
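The abstract is truncated before it states the duration rule itself, so the following is only a minimal sketch of the commonly cited form of Klatt's (1979) rule, DUR = MINDUR + (INHDUR - MINDUR) * PRCNT / 100, where PRCNT starts at 100 and is scaled by one factor per applicable rule. The function name, parameter names, and example values are hypothetical, and this is not the iterative estimation algorithm proposed in the paper, which the excerpt does not describe.

```python
from typing import Iterable


def klatt_duration(inherent_ms: float, minimum_ms: float,
                   factors: Iterable[float]) -> float:
    """Sketch of a Klatt-style duration rule (names are illustrative).

    DUR = MINDUR + (INHDUR - MINDUR) * PRCNT / 100,
    where PRCNT starts at 100 and is multiplied by each rule factor.
    """
    prcnt = 100.0
    for factor in factors:
        # Each applicable rule shortens (<1) or lengthens (>1) the phoneme.
        prcnt *= factor
    # Only the part of the duration above the minimum is scaled.
    return minimum_ms + (inherent_ms - minimum_ms) * prcnt / 100.0


if __name__ == "__main__":
    # Hypothetical values: a 100 ms inherent duration, a 40 ms floor,
    # and two made-up rule factors.
    print(klatt_duration(100.0, 40.0, [0.5, 1.2]))
```

In this form the factors are the quantities that would have to be chosen by hand or, as the abstract proposes, estimated automatically from an annotated corpus.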

Cited by 3 publications (3 citation statements). References: 5 publications.
“…The conclusions briefly discuss the possibilities of using this method for the Lithuanian language. Comparable results of Lithuanian phoneme duration models were obtained by Kasparaitis and Beniušė (2016). For duration modelling, the rule-based model proposed by Klatt (1979) was used.…”
Section: Sentence-level Intonation Modelling; citation type: mentioning
confidence: 99%
“…Many of them are presented as solutions to the 'copy synthesis' problem, which is the problem to estimate the input parameters to reconstruct a speech signal using a speech synthesizer. Copy synthesis is a difficult inverse problem because the mapping is non-linear and often is a 'from many to one' problem. One of them is that proposed by Kasparaitis [75], who proposed an iterative algorithm for the automatic estimation of the factors for the Klatt model using the corpus of an annotated audio record of the speaker. Another is that proposed by Laprie, [76], who describe an approach to track formant trajectories first, and to compute the amplitudes of the resonators by an algorithm derived from cepstral smoothing they called "true envelope".…”
Section: Experiments 4: Some Examples Produced For the Italian Language; citation type: mentioning
confidence: 99%
“…Despite the fact that duration models of Lithuanian sounds (Norkevičius and Raškinis, 2008), (Kasparaitis and Beniušė, 2016) and the intonation model of Lithuanian sentences (Vaičiūnas et al, 2016) have been developed in recent years, they will not be used in this work because only the phoneme-based synthesizer has the duration model implemented at the moment. Phonemes and diphones will be cut out of the recordings without any modifications.…”
Section: The Problem Of Missing Diphones; citation type: mentioning
confidence: 99%