International Conference on Multimodal Interaction 2022
DOI: 10.1145/3536220.3558038

Investigating Transformer Encoders and Fusion Strategies for Speech Emotion Recognition in Emergency Call Center Conversations.

Abstract: The emotion detection technology to enhance human decision-making is an important research issue for real-world applications, but real-life emotion datasets are relatively rare and small. The experiments conducted in this paper use the CEMO, which was collected in a French emergency call center. Two pre-trained models based on speech and text were fine-tuned for speech emotion recognition. Using pre-trained Transformer encoders mitigates our data's limited and sparse nature. This paper explores the different f…

Cited by 14 publications (1 citation statement)
References 47 publications
“…. Deschamps-Berger et al. [3] investigate the use of pre-trained and fine-tuned Transformer models for audio and text modalities for emotion recognition. It provides a use-case of how to apply pre-trained machine learning architectures to deal with limited data available in specific contexts.…”
Section: Investigating Transformer Encoders and Fusion Strategies
Citation type: mentioning
Confidence: 99%