2023
DOI: 10.3389/fcomp.2023.1039261

Task-specific speech enhancement and data augmentation for improved multimodal emotion recognition under noisy conditions

Abstract: Automatic emotion recognition (AER) systems are burgeoning and systems based on either audio, video, text, or physiological signals have emerged. Multimodal systems, in turn, have been shown to improve overall AER accuracy and to also provide some robustness against artifacts and missing data. Collecting multiple signal modalities, however, can be very intrusive, time consuming, and expensive. Recent advances in deep learning based speech-to-text and natural language processing systems, however, have enabled the de…

Cited by 3 publications (1 citation statement)
References 73 publications
“…On the environmental robustness side, previous studies have shown that using signal-based enhancement techniques is usually insufficient, as the distortions introduced by these algorithms can also degrade the model performance (e.g., [11]). Unseen noise and reverberation levels, for example, are known to drastically reduce the accuracy of even state-of-the-art ASR systems [12] and can be highly sensitive to different environmental conditions [13,14].…”
Section: Introduction (mentioning)
confidence: 99%