2009
DOI: 10.1016/j.specom.2008.12.001

Automatic refinement of an expressive speech corpus assembling subjective perception and automatic classification

Abstract: Automatic refinement of an expressive speech corpus assembling subjective perception and automatic classification. Speech Communication, Elsevier : North-Holland, 2009, 51 (9). This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production pr…

Cited by 23 publications (19 citation statements)
References 23 publications
“…The result of final multiple comparison of ANOVA also confirms good correlation between particular emotions. In the next future, we plan to use results of ANOVA and hypothesis test for creation of the database of values for emotional speech classifier based on statistical evaluation approach (Iriondo et al 2009), or it can be used for identification of speaker emotional states or in realtime emotion recognition systems (Attasi & Smékal 2008).…”
Section: Discussion (mentioning)
confidence: 99%
“…Suprasegmental features comprise statistical values of parameters describing prosody by duration, fundamental frequency, and energy. Included in this category is also a separate group of features constituting the voice quality parameters: jitter, shimmer [11], Hammarberg index [12], Liljencrants-Fant features [13], and spectral tilt [14]. All mentioned speech identification systems and classifiers are usually based on statistical approach, using the discriminative or artificial neural networks [15,16], hidden Markov models (HMM) [17], or Gaussian mixture models (GMM) [18,19].…”
Section: Introduction (mentioning)
confidence: 99%
“…The generation and, hence, the labeling of speech databases is one of the hot topics in affective US-TTS synthesis, e.g., [10]-[12] (see [13] for a review of previous emotional speech data collections). In this context, if an affective (or emotional) US-TTS system is considered for the generation of a talking-head, the complexity of the corpus generation and annotation process increases.…”
Section: Introduction (mentioning)
confidence: 99%