The IBM expressive text-to-speech synthesis system for American English

Pitrelli, John F.; Bakis, Raimo; Eide, Ellen; Fernandez, Raul Castro; Hamza, Wael; Picheny, Michael

doi:10.1109/tasl.2006.876123

Cited by 111 publications

(55 citation statements)

References 6 publications


“…The US-based ESS requires a large scale corpus of emotional speech, and emotional property of synthetic speech is colored by acoustic correlates represented by inventories in a corpus [2, 47-51]. The quality of ESS is dependent on the corpus and how to select inventories from the corpus.…”

Section: Corpus-based Approach (mentioning)

confidence: 99%

“…This approach is simple but it needs large costs for constructing corpora. Pitrelli et al. proposed another approach of the US-based ESS by building a corpus that mixed several emotions and by introducing the emotion as a feature for inventory selection [49]. Moriyama et al. proposed an idea for representing relationship among F0, energy, and duration using PCA on a subspace in ESS for Japanese words [50,51].…”

Section: Corpus-based Approach (mentioning)

confidence: 99%
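
The citation statement above describes emotion being used as a feature when selecting inventories from a mixed-emotion corpus. A minimal sketch of that idea follows, assuming a simple additive cost; the `Unit` class, the `acoustic_cost` field, and the `emotion_weight` parameter are hypothetical names chosen for illustration and are not taken from Pitrelli et al. or from the citing review.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """A candidate inventory (speech unit) from a mixed-emotion corpus."""
    unit_id: str
    emotion: str          # emotion label the unit was recorded with
    acoustic_cost: float  # hypothetical precomputed acoustic mismatch score


def target_cost(unit, wanted_emotion, emotion_weight=1.0):
    # Assumption: emotion mismatch is a 0/1 penalty added to the acoustic cost.
    mismatch = 0.0 if unit.emotion == wanted_emotion else 1.0
    return unit.acoustic_cost + emotion_weight * mismatch


def select_unit(candidates, wanted_emotion, emotion_weight=1.0):
    # Pick the candidate with the lowest combined cost.
    return min(
        candidates,
        key=lambda u: target_cost(u, wanted_emotion, emotion_weight),
    )


candidates = [
    Unit("a_neutral", "neutral", 0.20),
    Unit("a_happy", "happy", 0.50),
]

strict = select_unit(candidates, "happy", emotion_weight=1.0)
loose = select_unit(candidates, "happy", emotion_weight=0.1)
print(strict.unit_id, loose.unit_id)
```

With a large weight the matching-emotion unit is chosen even though its acoustic cost is higher; with a small weight the emotion label acts only as a soft preference.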


A review of paralinguistic information processing for natural speech communication

Yamashita

2013

Acoust. Sci. & Tech.


Speech conveys not only linguistic information but also supplemental information that is not inferable from written language, such as attitude, speaking style, intention, emotion, mental state, and so on, and is called para- or non-linguistic information. This type of information plays important roles for smooth and natural communication through spoken language. This paper reviews recognition and synthesis techniques for speech communication focusing on emotion and emphasis as well as corpora that are indispensable to development of current speech technologies.


“…1 shows a set of warping functions, depending on the values of ζ. It is clear that λ ∈ [1,2] can be mapped to multiple F0 values in [f0_l, f0_h] when ζ is altered. This forms the basis of frequency modulation in that λ and ζ can be used to represent the observed F0 contours and the adjusting proportions, respectively.…”

Section: Frequency Modulation (mentioning)

confidence: 99%
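
The excerpt above does not give the warping function itself. As a purely illustrative sketch, assume a power-law family that maps λ ∈ [1,2] onto [f0_l, f0_h], with ζ controlling the curvature; this functional form, the function name `warp_f0`, and the example range of 100 to 200 Hz are assumptions, not the formulation used by the cited authors.

```python
def warp_f0(lam, zeta, f0_low, f0_high):
    """Map lam in [1, 2] to an F0 value in [f0_low, f0_high].

    Assumed illustrative form: f0_low + (f0_high - f0_low) * (lam - 1) ** zeta.
    zeta = 1 gives a linear map; other values bend the curve.
    """
    if not 1.0 <= lam <= 2.0:
        raise ValueError("lam must lie in [1, 2]")
    if zeta <= 0:
        raise ValueError("zeta must be positive")
    return f0_low + (f0_high - f0_low) * (lam - 1.0) ** zeta


# The same lam yields different F0 values as zeta changes,
# while the endpoints lam = 1 and lam = 2 stay fixed.
for zeta in (0.5, 1.0, 2.0):
    print(zeta, warp_f0(1.5, zeta, 100.0, 200.0))
```

Under this assumed form, the endpoints of the F0 range are preserved for every ζ, so changing ζ redistributes F0 values inside the range without moving its limits.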

“…Furthermore, significant progress has been made in corpus-based unit concatenative synthesis technology [2] [3]. These two things have led to an improvement in voice quality of synthetic speech, which in turn has led to it becoming more common.…”

Section: Introduction (mentioning)

confidence: 99%


Frequency Modulation Technique for Prosodic Modification

Sakai, Shimizu, et al.

2008

2008 6th International Symposium on Chinese Spoken Language Processing


Modulation of speaking tone in frequency can make speech interesting and convey subtle meaning in communication. We present a frequency modulation (FM) technique for prosodic modification to consider communicative speech synthesis. This technique provides a mathematical formulation for representing speaking tone and manipulating FM in a unified framework. Two experiments are conducted with a text-to-speech system to which a module of FM-based prosodic modification is added. One is to enhance emphasis in words when synthesizing Chinese conversational speech. The other is to modify reading-style prosody while conveying good and bad news in Japanese; this is done by using the FM technique to shift the frequency ranges and rescale the fundamental frequency contours jointly. The experimental results indicated that the native speakers identified 90% of samples with emphases and 78% of "good news" as well as 94% of "bad news" samples. The FM technique is vital for making synthetic speech communicative.


A Concatenative Synthesis Based Speech Synthesiser for Hindi

Gupta

2008

Advances in Computer and Information Sciences and Engineering

