International Conference on Acoustics, Speech, and Signal Processing
DOI: 10.1109/icassp.1989.266404

Neural network based generation of fundamental frequency contours

Cited by 31 publications
(17 citation statements)
References 3 publications
“…Our model has the advantage over these purely rule-based models in that it can be automatically trained. Since the structure is based in linguistic theory, it also covers a broader range of prosodic characteristics than do other automatically-trainable models, e.g., [38]. A potential advantage of the dynamical system model is that it can handle vector processes, allowing for joint modeling of energy and .…”
Section: Discussion
confidence: 99%
“…For example, consider the following relatively simple case with parameter tying only in the observation equation: (38) In this case, both and have the data broken up into regions according to the factor , but for the data must be further partitioned according to the factor . If can take on two values ( and ) then the data is broken up to define two parameters and for every .…”
Section: Parameter Tying
confidence: 99%
“…Among several database-driven methods, CART and neural network models are more popular (Breiman et al 1984; Dusterhoff et al 1999; Goubanova & King 2008; Cosi et al 2001; Tesser et al 2004; Vainio & Altosaar 1998; Vegnaduzzo 2003). Several models based on neural network principles are described in the literature for predicting the intonation patterns of syllables in continuous speech (Scordilis & Gowdy 1989; Vainio & Altosaar 1998; Vainio 2001; Sonntag et al 1997; Buhmann et al 2000; Hwang & Chen 1994). Scordilis & Gowdy (1989) used neural networks in a parallel and distributed manner to predict the average F0 value for each phoneme, and the temporal variations of F0 within a phoneme.…”
Section: Input Layer
confidence: 99%
“…Several models based on neural network principles are described in the literature for predicting the intonation patterns of syllables in continuous speech (Scordilis & Gowdy 1989; Vainio & Altosaar 1998; Vainio 2001; Sonntag et al 1997; Buhmann et al 2000; Hwang & Chen 1994). Scordilis & Gowdy (1989) used neural networks in a parallel and distributed manner to predict the average F0 value for each phoneme, and the temporal variations of F0 within a phoneme. The network consists of two levels: macroscopic and microscopic levels.…”
Section: Input Layer
confidence: 99%
“…Model-based (e.g., regression trees, HMMs, neural networks) or sample-based (e.g., vector quantization, contour selection) mapping tools are then used to achieve the best phonetic prediction according to a distance metric, generally Root Mean Square (RMS) error. Prediction is generally performed with separate trainable models for f0 (Ljolje and Fallside 1986;Scordilis and Gowdy 1989;Sagisaka 1990;Traber 1992), for phoneme durations (Klatt 1979;O'Shaughnessy 1981;Bartkova and Sorin 1987;Riley 1992;van Santen 1992) and, more recently, for intensity profiles (Trouvain, Barry et al 1998). With the development of corpus-based synthesis techniques and powerful mapping tools (Campbell 1992;van Santen 2002), multiparametric prosodic models (Mixdorff and Jokisch 2001;Tesser, Cosi et al 2004) now tend to use general-purpose and theory-neutral tools.…”
Section: Introduction
confidence: 99%
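
The last statement above names Root Mean Square (RMS) error as the usual distance metric for judging a predicted f0 contour against a reference. As a minimal sketch of that metric only, assuming both contours are sampled at the same time points and given in Hz (the function name `rms_error` and the sample values are illustrative, not taken from any of the cited papers):

```python
import math


def rms_error(predicted, reference):
    """Root mean square error between two equally sampled f0 contours.

    Both arguments are sequences of f0 values (assumed here to be in Hz)
    taken at the same time points. This is a generic RMS computation, not
    the evaluation procedure of any particular cited model.
    """
    if len(predicted) != len(reference):
        raise ValueError("contours must have the same number of samples")
    if len(predicted) == 0:
        raise ValueError("contours must not be empty")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical contours for illustration only.
reference_f0 = [120.0, 125.0, 130.0, 128.0]
predicted_f0 = [121.0, 126.0, 131.0, 129.0]
error_hz = rms_error(predicted_f0, reference_f0)
```

A constant 1 Hz offset at every sample, as in the hypothetical contours here, gives an RMS error of 1 Hz; larger local deviations are penalized quadratically before the square root is taken.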