2015 International Conference on Affective Computing and Intelligent Interaction (ACII)
DOI: 10.1109/acii.2015.7344651

Recognizing emotions in dialogues with acoustic and lexical features

Abstract: Automatic emotion recognition has long been a focus of Affective Computing. We aim at improving the performance of state-of-the-art emotion recognition in dialogues using novel knowledge-inspired features and modality fusion strategies. We propose features based on disfluencies and nonverbal vocalisations (DIS-NVs), and show that they are highly predictive for recognizing emotions in spontaneous dialogues. We also propose the hierarchical fusion strategy as an alternative to current feature-level and decision-level fusion strategies.


Cited by 6 publications (6 citation statements)
References 39 publications
“…The feature values are the ratios between the durations of DIS-NV events and the utterance duration. Compared to the AVEC2012 database of spontaneous dialogue, DIS-NVs are less frequent in the IEMOCAP database of acted dialogue, which limits their performance on emotion recognition in acted dialogue [21,44].…”
Section: Disfluency and Non-verbal Vocalisation (DIS-NV) Features (mentioning)
confidence: 99%
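The ratio computation described in the quote above is simple to sketch. Below is a minimal, hypothetical Python illustration; the event type names and the input layout are assumptions for illustration, not taken from the paper. Each DIS-NV feature is the summed duration of one event type divided by the utterance duration.

```python
# Hypothetical sketch of the DIS-NV duration-ratio features described above.
# The event type names below are illustrative assumptions, not the authors' code.
DIS_NV_TYPES = ["filled_pause", "filler", "stutter", "laughter", "breath"]

def dis_nv_features(events, utterance_duration):
    """events: list of (event_type, start_sec, end_sec) tuples annotated on
    one utterance. Returns one duration ratio per DIS-NV event type."""
    totals = {t: 0.0 for t in DIS_NV_TYPES}
    for event_type, start, end in events:
        if event_type in totals:
            totals[event_type] += end - start
    return {t: totals[t] / utterance_duration for t in DIS_NV_TYPES}

# Example: a 4-second utterance containing one 0.5 s filled pause.
print(dis_nv_features([("filled_pause", 1.2, 1.7)], 4.0))
# {'filled_pause': 0.125, 'filler': 0.0, 'stutter': 0.0, 'laughter': 0.0, 'breath': 0.0}
```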
“…and audio features $\{y_i\}_{i=1}^n \in \mathbb{R}^q$, we could obtain the transformed features for an arbitrary visual or audio feature by Equation (27). Based on these transformed features, we used feature-level and decision-level fusion strategies, respectively, to evaluate the effectiveness of the ICDKCFA for continuous dimensional emotion recognition.…”
Section: Methods (mentioning)
confidence: 99%
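The quoted passage contrasts feature-level fusion (joining modality features before a single classifier) with decision-level fusion (combining per-modality predictions). A minimal generic sketch of the two strategies follows, with stand-in classifiers; it does not reproduce the ICDKCFA transform or its Equation (27).

```python
import numpy as np

def feature_level_fusion(x_visual, y_audio, classifier):
    """Concatenate transformed features from both modalities, classify once."""
    return classifier(np.concatenate([x_visual, y_audio], axis=-1))

def decision_level_fusion(x_visual, y_audio, clf_visual, clf_audio, w=0.5):
    """Classify each modality separately, then blend the two predictions."""
    return w * clf_visual(x_visual) + (1 - w) * clf_audio(y_audio)

# Toy usage with stand-in "classifiers" on 4-dim transformed features.
rng = np.random.default_rng(0)
x, y = rng.normal(size=4), rng.normal(size=4)
print(feature_level_fusion(x, y, classifier=lambda f: f.sum()))
print(decision_level_fusion(x, y, clf_visual=np.mean, clf_audio=np.mean))
```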
“…Sayedelahl et al. [20] presented a combined bi-modal feature-decision fusion approach to improve the performance of emotion recognition. Tian et al. [27] proposed a hierarchical fusion strategy to combine features from different modalities at different layers of a hierarchical structure.…”
Section: Introduction (mentioning)
confidence: 99%
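As a rough illustration of the hierarchical idea attributed to Tian et al. [27], the sketch below feeds each modality's features into a different layer of a stacked model instead of concatenating everything at one point. The layer ordering and transforms are assumptions made purely for illustration.

```python
import numpy as np

def hierarchical_fusion(feat_layers, layer_models, head):
    """feat_layers: per-layer feature arrays, ordered bottom-up; each layer's
    transform sees its own features concatenated with the hidden output of the
    layer below. head: final classifier applied to the top hidden state."""
    hidden = None
    for feats, model in zip(feat_layers, layer_models):
        inp = feats if hidden is None else np.concatenate([hidden, feats], axis=-1)
        hidden = model(inp)
    return head(hidden)

# Toy usage: acoustic features enter at the bottom layer, lexical at the top.
acoustic, lexical = np.ones(6), np.ones(3)
identity = lambda v: v
print(hierarchical_fusion([acoustic, lexical], [identity, identity], head=np.sum))
# Layer 1 passes acoustic through; layer 2 concatenates lexical -> sum = 9.0
```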
“…Mohamed R. Amer et al. [48] used the deep networks technique and designed a model for speech emotion detection. Leimin Tian et al. [49] used the LSTM technique and developed a model for recognizing emotions in dialogues with acoustic and lexical features. The authors concluded that this model had the potential to improve the quality of emotional interactions in current dialogue systems.…”
Section: Survey On Other Deep Learning Techniques (mentioning)
confidence: 99%
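For readers unfamiliar with the setup the survey describes, here is a minimal PyTorch sketch of that kind of model: an LSTM running over a dialogue's utterances, each represented by concatenated acoustic and lexical feature vectors, predicting one emotion label per utterance. All dimensions and architecture details are illustrative assumptions, not the configuration of Tian et al. [49].

```python
import torch
import torch.nn as nn

class DialogueEmotionLSTM(nn.Module):
    """Illustrative sketch (not the authors' code): an LSTM over a dialogue's
    utterances, each a concatenated acoustic + lexical feature vector."""
    def __init__(self, acoustic_dim, lexical_dim, hidden_dim, n_emotions):
        super().__init__()
        self.lstm = nn.LSTM(acoustic_dim + lexical_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, n_emotions)

    def forward(self, acoustic, lexical):
        x = torch.cat([acoustic, lexical], dim=-1)  # (batch, utterances, feats)
        h, _ = self.lstm(x)                         # hidden state per utterance
        return self.out(h)                          # per-utterance emotion logits

# Toy usage: 2 dialogues of 10 utterances; dims are arbitrary assumptions.
model = DialogueEmotionLSTM(acoustic_dim=88, lexical_dim=5, hidden_dim=64, n_emotions=4)
logits = model(torch.randn(2, 10, 88), torch.randn(2, 10, 5))
print(logits.shape)  # torch.Size([2, 10, 4])
```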