2022
DOI: 10.1007/s11042-022-14051-z
A statistical feature extraction for deep speech emotion recognition in a bilingual scenario

Cited by 10 publications (4 citation statements)
References 62 publications
“…Certain features or tasks lend themselves to particular methodologies more than others, and some methodologies may require large amounts of labelled data. The authors proposed two methods: (a) a statistical parameterization framework that represents the speech signal as a fixed-length vector, and (b) a deep learning approach that combines three convolutional neural network architectures (1). Their methods achieved 87.08% and 83.90% accuracy on the RAVDESS and EMOVO datasets, respectively.…”
Section: Review of Previous Work on SER
confidence: 99%
“…Speech emotion can be regarded as a kind of stress overlaid on each sound event in an utterance. Emotive speech exhibits a specific prosody (1). A language's prosodic rules change over time as the community's culture changes.…”
Section: Introduction
confidence: 99%
“…Sekkate et al. (2023) [14] introduced a statistics-based technique for speech emotion recognition. The researchers first converted the speech signals into MFCCs and used the mean MFCC values as features.…”
confidence: 99%
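The statement above describes the core idea of statistical parameterization: pooling frame-level spectral coefficients (e.g., MFCCs) with statistics such as the mean so that utterances of any duration map to a fixed-length vector. A minimal NumPy sketch of that pooling step is below; it uses simple log-spectral coefficients as an illustrative stand-in for MFCCs (a real MFCC pipeline would add a mel filterbank and DCT), and all function names and parameter values are assumptions, not the authors' exact implementation.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames (25 ms / 10 ms at 16 kHz)."""
    n = 1 + max(0, (len(x) - frame_len) // hop)
    return np.stack([x[i * hop : i * hop + frame_len] for i in range(n)])

def statistical_features(x, n_coeffs=13):
    """Fixed-length vector: mean and std of per-frame log-spectral coefficients.

    Illustrative stand-in for mean-MFCC pooling: the statistics are taken
    over the time axis, so the output size is independent of duration.
    """
    frames = frame_signal(x)
    spec = np.abs(np.fft.rfft(frames * np.hamming(frames.shape[1]), axis=1))
    logspec = np.log(spec[:, :n_coeffs] + 1e-10)   # keep first n_coeffs bins
    return np.concatenate([logspec.mean(axis=0), logspec.std(axis=0)])

rng = np.random.default_rng(0)
short = rng.standard_normal(8000)    # ~0.5 s at 16 kHz
long_ = rng.standard_normal(48000)   # ~3 s at 16 kHz
v1, v2 = statistical_features(short), statistical_features(long_)
print(v1.shape, v2.shape)  # both (26,): duration-independent
```

The resulting fixed-length vectors can then be fed to any standard classifier, which is what makes this style of parameterization convenient for variable-length speech.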
“…These algorithms utilize signal processing techniques, feature extraction, and machine learning to analyze acoustic properties and patterns related to emotional states. Acoustic features, including voice quality features, are extracted to capture the valence and arousal dimensions [89][90][91][92]. Such AI-based visual and audio algorithms are integrated into the Furhat robot, allowing it to display emotions and expressions on its face via animations that correspond to the emotional content and tone of the speech being delivered.…”
confidence: 99%