2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221)
DOI: 10.1109/icassp.2001.941034

Duration modeling in a restricted-domain female-voice synthesis in Spanish using neural networks

Abstract: The objective of this paper is the accurate prediction of segmental duration in a Spanish text-to-speech system. There are many parameters that affect duration, but not all of them are always relevant. We present a complete environment in which to decide which parameters are more relevant and the best way to code them. This work is the continuation of [1], where all efforts were dedicated to an unrestricted-domain database for a male voice. In this case, we are considering a female voice in a restricted-domain…

Cited by 3 publications (4 citation statements)
References 6 publications
“…We have applied the same methodology to a female speaker in a restricted-domain environment (Córdoba et al, 2001). The most remarkable differences with the unrestricted-domain database presented in this paper are the following: the improvements are bigger, up to 5%; all position and "number of units" parameters give consistent improvements (between 2 and 3·5%); syllable structure and function word mean an improvement close to 2·5%; the parameters related to phrase provide better results; the best parameter is "position of the word in the phrase"; and the window of five phonemes for the phonemes identity is 5% better than the window of three phonemes.…”
Section: Summary Of Results For Parameter Evaluation (mentioning)
confidence: 99%
“…The detailed figures of our database are shown in Table I. Throughout the paper we will also mention the results we have obtained in similar experiments using a female speaker in a restricted-domain environment (Córdoba, Montero, Gutierrez-Arriola & Pardo, 2001). This restricted-domain offers several advantages to the modelling: the variation in the different patterns is reduced, and there are more instances of each vector of parameters in the database.…”
Section: Contents (mentioning)
confidence: 99%
“…Furthermore, artificial neural networks (Campbell 1992; Cordoba et al 2001), Bayesian networks models (Chien and Huang 2003; Goubanova and King 2008) and instance-based algorithms (Lazaridis et al 2007) have also been introduced on the phone duration modeling task. In the following, we briefly mention some of the most frequently and successfully used methods for phone duration modeling (cf.…”
Section: Data-driven Techniques (mentioning)
confidence: 99%
“…In German, a significant work has been done in [15], by capturing the types of interaction patterns affecting the duration. Neural network is used to predict the segmental duration in [16]. Objective evaluation of the model gave root mean squared error (RMSE) of 0.4536.…”
Section: Introduction (mentioning)
confidence: 99%
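
The last citation statement quotes a root mean squared error (RMSE) without giving its definition or units. As a reminder of what the metric measures, here is a minimal Python sketch of the standard RMSE formula. The function name `rmse` and the duration values are hypothetical and invented for illustration; they are not taken from the cited papers and say nothing about how those authors computed their figure.

```python
import math


def rmse(predicted, observed):
    """Root mean squared error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    if not predicted:
        raise ValueError("sequences must be non-empty")
    # Square each prediction error, average the squares, take the root.
    squared_errors = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical segment durations, invented for illustration only.
predicted_durations = [80.0, 95.0, 60.0, 120.0]
observed_durations = [75.0, 100.0, 65.0, 110.0]
print(rmse(predicted_durations, observed_durations))
```

Because the errors are squared before averaging, a single badly predicted segment raises the RMSE more than several small misses of the same total size. The result is in the same units as the inputs, so an RMSE is only interpretable once the unit or scale of the predicted durations is known.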