1993
DOI: 10.1121/1.405558

Toward the simulation of emotion in synthetic speech: A review of the literature on human vocal emotion

Abstract: There has been considerable research into perceptible correlates of emotional state, but a very limited amount of the literature examines the acoustic correlates and other relevant aspects of emotion effects in human speech; in addition, the vocal emotion literature is almost totally separate from the main body of speech analysis literature. A discussion of the literature describing human vocal emotion, and its principal findings, are presented. The voice parameters affected by emotion are found to be of three…

Cited by 874 publications (494 citation statements)
“…That has been called the palette theory (Scherer, 1984b). The term primary is still widely used, in lay parlance and in the speech literature (e.g., Murray and Arnott, 1993); but in fact, ‘palette’ theories have very little support in modern emotion research (Ekman, 1999).…”
Section: Lists Of Key Emotion Categories
mentioning
confidence: 99%
“…Williams and Stevens (1972) concluded that the pitch contour is the best indicator of the emotional content of an utterance. In their review of the literature, Murray and Arnott (1993) noted that the most commonly referenced vocal parameters are pitch (i.e., both the average value and range of the fundamental frequency), duration, intensity, and the undefined term voice quality.…”
mentioning
confidence: 99%
“…‘Anxious’ utterances show segments that are shorter than average, with exception of voiceless plosives. Also in (Murray and Arnott, 1993), relations were shown between the emotion state and the duration of vowels and consonants. But in nearly all studies pitch and energy are the most commonly applied features to distinguish and classify emotion state (Murray and Arnott, 1993), or anyway to convey supra-textual information.…”
Section: Emotion and Asr Affective Computing
mentioning
confidence: 99%
“…Also in (Murray and Arnott, 1993), relations were shown between the emotion state and the duration of vowels and consonants. But in nearly all studies pitch and energy are the most commonly applied features to distinguish and classify emotion state (Murray and Arnott, 1993), or anyway to convey supra-textual information. In Slaney and McRoberts, 1998, a study was conducted to automatically classify an utterance (spoken by a parent to a young infant) into three classes: approval, attention and prohibition.…”
Section: Emotion and Asr Affective Computing
mentioning
confidence: 99%