2007 IEEE International Conference on Acoustics, Speech and Signal Processing - ICASSP '07 2007
DOI: 10.1109/icassp.2007.367261

Stress and Emotion Classification using Jitter and Shimmer Features

Abstract: In this paper, we evaluate the use of appended jitter and shimmer speech features for the classification of human speaking styles and of animal vocalization arousal levels. Jitter and shimmer features are extracted from the fundamental frequency contour and added to baseline spectral features, specifically Mel-frequency cepstral coefficients (MFCCs) for human speech and Greenwood function cepstral coefficients (GFCCs) for animal vocalizations. Hidden Markov models (HMMs) with Gaussian mixture models (GMMs) sta…
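
As a rough illustration of the kind of features the abstract describes, the sketch below computes cycle-to-cycle relative jitter and shimmer and appends them to a spectral feature frame. It assumes the common "local" definitions (mean absolute difference between consecutive cycles, divided by the mean); the paper's exact formulas are not given in this excerpt. The function names are hypothetical, and the per-cycle pitch periods and peak amplitudes are assumed to have been extracted already.

```python
from statistics import fmean


def mean_abs_consecutive_diff(values):
    """Mean absolute difference between consecutive cycle measurements."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    return fmean(abs(b - a) for a, b in zip(values, values[1:]))


def relative_jitter(periods):
    """Cycle-to-cycle variation of the pitch period, relative to the mean period."""
    return mean_abs_consecutive_diff(periods) / fmean(periods)


def relative_shimmer(amplitudes):
    """Cycle-to-cycle variation of the peak amplitude, relative to the mean amplitude."""
    return mean_abs_consecutive_diff(amplitudes) / fmean(amplitudes)


def append_features(spectral_frame, jitter, shimmer):
    """Return a new feature vector: the spectral coefficients plus jitter and shimmer."""
    return list(spectral_frame) + [jitter, shimmer]


# Illustrative values only (not data from the paper).
periods = [0.0100, 0.0102, 0.0099, 0.0101]   # pitch periods in seconds
amplitudes = [0.80, 0.78, 0.83, 0.79]        # peak amplitude per cycle
frame = append_features([1.5, -0.3, 0.7],
                        relative_jitter(periods),
                        relative_shimmer(amplitudes))
```

Here `frame` holds the three placeholder spectral coefficients followed by the two appended values, which mirrors the "appended features" idea without claiming to reproduce the authors' front end.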

Cited by 108 publications (54 citation statements)
References 7 publications
“…They are commonly measured for long sustained vowels, and values of jitter and shimmer above a certain threshold are considered being related to pathological voices, which are usually perceived by humans as breathy, rough or hoarse voices. More recently, they have also been used to determine the classification of human speaking styles [36] and the age and gender of the speakers [11]. Absolute jitter values, for instance, are found larger in males, while…”
Section: Jitter and Shimmer (mentioning)
confidence: 99%
“…From literature and our experiments follows that different types of emotions are manifested not only in prosodic patterns (F0, energy, duration) and several voice quality features (e.g. jitter, shimmer, glottal-to-noise excitation ratio, Hammarberg index) (Li et al 2007) but also by significant changes in spectral domain (Nwe et al 2003). Several spectral features (spectral centroid, spectral flatness measure, Renyi entropy, etc.)…”
Section: Introduction (mentioning)
confidence: 99%
“…On the other hand, Slyh et al (1999) and Li et al (2005) reported that significant differences can occur in jitter and shimmer measurements between different speaking styles, especially in shimmer measurement. Nevertheless, prosody is also highly-dependent on the emotion of the speaker, and prosodic features are useful in automatic recognition systems even when no emotional state is distinguished, which leads to the hypothesis that jitter and shimmer features can be also useful in the speaker recognition task.…”
Section: Jitter and Shimmer (mentioning)
confidence: 99%
“…Jitter and shimmer, for example, have been largely used to detect pathological and characteristic voices like breathy, rough or hoarse voices (Michaelis et al, 1998;Kreiman and Gerrat, 2005). More recently, they have also been used to determine the age and gender of the speakers (Wittig and Müller, 2005) and the classification of human speaking styles (Li et al, 2005).…”
Section: Linguistic Levels (mentioning)
confidence: 99%