Anger recognition in speech using acoustic and linguistic cues

Polzehl, Tim; Schmitt, Alexander; Metze, Florian; Wagner, Michael

doi:10.1016/j.specom.2011.05.002

Cited by 93 publications

(30 citation statements)

References 24 publications

Supporting

Mentioning

Contrasting

Unclassified

Order By: Relevance

“…Voice features could be divided into two categories: Acoustic and linguistic features [21]. However, since we are aiming to find general characteristics for depressed speech regardless of the language used, linguistic features are not being analysed here.…”

Section: Feature Extractionmentioning

confidence: 99%

Detecting depression: A comparison between spontaneous and read speech

Alghowinem

Goecke

Wagner

et al. 2013

2013 IEEE International Conference on Acoustics, Speech and Signal Processing

Self Cite

100

View full text Add to dashboard Cite

Major depressive disorders are mental disorders of high prevalence, leading to a high impact on individuals, their families, society and the economy. In order to assist clinicians to better diagnose depression, we investigate an objective diagnostic aid using affective sensing technology with a focus on acoustic features. In this paper, we hypothesise that (1) classifying the general characteristics of clinical depression using spontaneous speech will give better results than using read speech, (2) that there are some acoustic features that are robust and would give good classification results in both spontaneous and read, and (3) that a 'thin-slicing' approach using smaller parts of the speech data will perform similarly if not better than using the whole speech data. By examining and comparing recognition results for acoustic features on a real-world clinical dataset of 30 depressed and 30 control subjects using SVM for classification and a leave-one-out cross-validation scheme, we found that spontaneous speech has more variability, which increases the recognition rate of depression. We also found that jitter, shimmer, energy and loudness feature groups are robust in characterising both read and spontaneous depressive speech. Remarkably, thin-slicing the read speech, using either the beginning of each sentence or the first few sentences performs better than using all reading task data.

show abstract

Section: Feature Extractionmentioning

confidence: 99%

Detecting depression: A comparison between spontaneous and read speech

Alghowinem

Goecke

Wagner

et al. 2013

2013 IEEE International Conference on Acoustics, Speech and Signal Processing

Self Cite

100

View full text Add to dashboard Cite

show abstract

“…For feature extraction, voice features can be categorised into acoustic and linguistic features [15]. Acoustic features can also be categorised into low-level descriptors (LLD) and statistical functionals, which are calculated based on the LLD over certain units (e.g.…”

Section: Real-world Clinically Validated Datamentioning

confidence: 99%

A comparative study of different classifiers for detecting depression from spontaneous speech

Alghowinem

Goecke

Wagner

et al. 2013

2013 IEEE International Conference on Acoustics, Speech and Signal Processing

Self Cite

View full text Add to dashboard Cite

Accurate detection of depression from spontaneous speech could lead to an objective diagnostic aid to assist clinicians to better diagnose depression. Little thought has been given so far to which classifier performs best for this task. In this study, using a 60-subject real-world clinically validated dataset, we compare three popular classifiers from the affective computing literature -Gaussian Mixture Models (GMM), Support Vector Machines (SVM) and Multilayer Perceptron neural networks (MLP) -as well as the recently proposed Hierarchical Fuzzy Signature (HFS) classifier. Among these, a hybrid classifier using GMM models and SVM gave the best overall classification results. Comparing feature, score, and decision fusion, score fusion performed better for GMM, HFS and MLP, while decision fusion worked best for SVM (both for raw data and GMM models). Feature fusion performed worse than other fusion methods in this study. We found that loudness, root mean square, and intensity were the voice features that performed best to detect depression in this dataset.

show abstract

“…Abrupt changes in the spectrum is captured by calculating the spectral flux [19] in which power spectrum for one frame is compared against the power spectrum from the previous frame for a total number of A samples/frame given by Eq. (3).…”

Section: Spectral Fluxmentioning

confidence: 99%

Emotion Recognition from Speech using Discriminative Features

Chandrasekar¹,

Chapaneri²,

Jayaswal³

2014

IJCA

View full text Add to dashboard Cite

Creating an accurate Speech Emotion Recognition (SER) system depends on extracting features relevant to that of emotions from speech. In this paper, the features that are extracted from the speech samples include Mel Frequency Cepstral Coefficients (MFCC), energy, pitch, spectral flux, spectral roll-off and spectral stationarity. In order to avoid the 'curse of dimensionality', statistical parameters, i.e. mean, variance, median, maximum, minimum, and index of dispersion have been applied on the extracted features. For classifying the emotion in an unknown test sample, Support Vector Machines (SVM) has been chosen due to its proven efficiency. Through experimentation on the chosen features, an average classification accuracy of 86.6% has been achieved using one-v/s-all multi-class SVM which is further improved to 100% when reduced to binary form problem. Classifier metrics viz. precision, recall, and F-score values show that the proposed system gives improved accuracy for Emo-DB. General TermsPattern Recognition

show abstract

Anger recognition in speech using acoustic and linguistic cues

Cited by 93 publications

References 24 publications

Detecting depression: A comparison between spontaneous and read speech

Detecting depression: A comparison between spontaneous and read speech

A comparative study of different classifiers for detecting depression from spontaneous speech

Emotion Recognition from Speech using Discriminative Features

Contact Info

Product

Resources

About