1985
DOI: 10.1121/1.2022951

Text to speech—An overview

Abstract: We discuss the design of a text-to-speech synthesizer, which accepts any type of English text as input, and creates an appropriate speech signal as output. Effective algorithms for converting text to sound must make use of intermediate data structures that systematically encode the degrees of freedom available to speakers of the language being synthesized. These data structures are an engineering approximation to what linguists call phonological representations; we will call them “P-structures.” Any TTS system…

Cited by 43 publications
(29 citation statements)
“…Similar models were developed in other languages, such as: French (Bartkova and Sorin 1987), Swedish (Carlson and Granstrom 1986), German (Kohler 1988) and Greek (Epitropakis et al 1993), and varieties, such as: American English (Allen et al 1987; Olive and Liberman 1985) and Brazilian Portuguese (Simoes 1990). The major drawback of the rule-based approaches is the difficulty to represent and manually tune all the linguistic factors which influence the segmental duration in speech.…”
Section: Rule-based Techniques (mentioning)
confidence: 89%
“…The phone duration modeling methods reported in the literature can be broadly divided in two major categories: the rule-based (Allen et al 1987; Bartkova and Sorin 1987; Carlson and Granstrom 1986; Epitropakis et al 1993; Klatt 1979; Olive and Liberman 1985; Simoes 1990) and the data-driven methods (Chien and Huang 2003; Goubanova and King 2008; Lazaridis et al 2007; Möbius and Santen 1996; Riley 1992; Takeda et al 1989). Although, in the present study we consider only duration modeling techniques, which belong to the category of data-driven methods, for comprehensiveness of exposition we briefly review also the main rule-based techniques.…”
Section: Overview Of Phone Duration Modeling Techniques (mentioning)
confidence: 99%
“…Initially a set of intrinsic (starting) values was assigned on each phone which was modified each time according to the extracted rules. Models of this type and similar to this were developed in many languages such as French (Bartkova and Sorin, 1987), Swedish (Carlson and Granstrom, 1986), German (Kohler, 1988) and Greek (Epitropakis et al, 1993; Yiourgalis and Kokkinakis, 1996), as well as in several dialects such as American English (Allen et al, 1987; Olive and Liberman, 1985) and Brazilian Portuguese (Simoes, 1990). The main disadvantage of the rule-based approaches is the difficulty to represent and tune manually all the linguistic factors, such as the phonetic, the morphological and the syntactic ones, which influence the segmental duration in speech.…”
Section: Introduction (mentioning)
confidence: 94%
“…Starting from some intrinsic value, the duration of a segment is modified by successively applied rules. Models of this type have been developed for several languages including American English [1,13], Swedish [4], German [9], French [2], and Brazilian Portuguese [17]. When large speech databases and the computational means for analyzing these corpora became available, new approaches were proposed based on, for example, Classification and Regression Trees (CART) [14,15] and neural networks [3].…”
Section: Introduction (mentioning)
confidence: 99%
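
The citation statements above describe rule-based duration models as starting from an intrinsic value per phone and modifying it by successively applied rules. A minimal sketch of that general idea follows. It is not the method of any cited paper: the phone labels, the intrinsic durations, the two rules, and their scaling factors are all hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Segment:
    """One phone with the context features the rules look at (hypothetical)."""
    phone: str
    stressed: bool = False
    phrase_final: bool = False


# Hypothetical intrinsic (starting) durations in milliseconds.
INTRINSIC_MS: Dict[str, float] = {"AE": 120.0, "T": 60.0, "K": 60.0}

# A rule takes a segment and its current duration and returns a new duration.
Rule = Callable[[Segment, float], float]


def lengthen_phrase_final(seg: Segment, dur: float) -> float:
    # Assumed factor: phrase-final segments are stretched by 40%.
    return dur * 1.4 if seg.phrase_final else dur


def shorten_unstressed(seg: Segment, dur: float) -> float:
    # Assumed factor: unstressed segments are compressed to 70%.
    return dur if seg.stressed else dur * 0.7


RULES: List[Rule] = [lengthen_phrase_final, shorten_unstressed]


def predict_duration(seg: Segment, rules: List[Rule] = RULES) -> float:
    """Start from the intrinsic value and apply each rule in order."""
    dur = INTRINSIC_MS[seg.phone]
    for rule in rules:
        dur = rule(seg, dur)
    return dur


if __name__ == "__main__":
    print(predict_duration(Segment("AE", stressed=True, phrase_final=True)))
    print(predict_duration(Segment("T")))
```

Every linguistic factor needs its own hand-written rule and hand-tuned constant in a design like this, which is the drawback the citing papers point to when they contrast rule-based models with data-driven ones.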