2022
DOI: 10.33889/ijmems.2022.7.1.004

A Novel S-LDA Features for Automatic Emotion Recognition from Speech using 1-D CNN

Abstract: Emotions are explicit and serious mental activities that find expression in speech, body gestures, facial features, and so on. Speech is a fast, effective, and highly convenient mode of human communication, and it has therefore become the most researched modality in Automatic Emotion Recognition (AER). Extracting the most discriminative and robust features from speech for AER remains a challenge. This paper proposes a new algorithm named Shifted Linear Discrim…

Cited by 3 publications (3 citation statements)
References 39 publications
“…Additionally, when compared to other classifiers, the MLP classifier achieves superior results across all three datasets. Shifted Linear Discriminant Analysis (S-LDA) is proposed in paper [34] to derive dynamic attributes from static low-level variables like MFCC and Pitch. These adjusted features go into a 1D-CNN to extract high-level features for automatic event recognition (AER).…”
Section: Related Work (mentioning)
confidence: 99%
“…Furthermore, the spectral centroid yields a single scalar value at (34), indicating where most of the audio signal's energy lies in frequency; spectral contrast with seven scalar values at index (35-41) represents the difference in amplitude between peaks and troughs in the audio spectrum. Then, the third spectral roll-off feature at index (42) with a single value marks the frequency below which a specific percentage (e.g., 85%) of the total spectral energy lies.…”
Section: B Speech Characteristics (mentioning)
confidence: 99%
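
The two single-valued features described in this statement, spectral centroid and spectral roll-off, can be illustrated with a minimal standard-library sketch. This is not the citing paper's code: the function names, the naive DFT, the use of squared magnitude as "energy", and the test tone are all assumptions made for illustration. Audio libraries often compute roll-off on magnitudes rather than squared magnitudes, so values may differ slightly.

```python
import cmath
import math


def magnitude_spectrum(frame, sample_rate):
    """Return (frequencies, magnitudes) for the non-negative DFT bins of one frame.

    Uses a naive O(n^2) DFT so the example needs no third-party packages.
    """
    n = len(frame)
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        freqs.append(k * sample_rate / n)
        mags.append(abs(acc))
    return freqs, mags


def spectral_centroid(freqs, mags):
    """Magnitude-weighted mean frequency: the 'centre of mass' of the spectrum."""
    total = sum(mags)
    return sum(f * m for f, m in zip(freqs, mags)) / total if total > 0 else 0.0


def spectral_rolloff(freqs, mags, fraction=0.85):
    """Lowest bin frequency at which cumulative energy reaches `fraction` of the total.

    Assumption: energy is taken as squared magnitude.
    """
    energy = [m * m for m in mags]
    threshold = fraction * sum(energy)
    running = 0.0
    for f, e in zip(freqs, energy):
        running += e
        if running >= threshold:
            return f
    return freqs[-1]


# Hypothetical test signal: a pure 1 kHz tone that falls exactly on a DFT bin.
sample_rate = 8000
n = 64
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(n)]
freqs, mags = magnitude_spectrum(tone, sample_rate)
centroid = spectral_centroid(freqs, mags)
rolloff = spectral_rolloff(freqs, mags)
```

For a pure tone, both features sit at the tone frequency; for real speech frames they diverge, which is what makes them useful as separate descriptors.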
“…CNN has the ability of representation learning to pan unclassify the input information according to the hierarchical structure and it has been widely used in image classification (Bisht and Gupta, 2020), and speech recognition (Tiwari and Darji, 2022). 1D CNN, then, is an application of CNN model to the extraction of one-dimensional signals.…”
Section: Reliability Prediction Model (mentioning)
confidence: 99%
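
As a rough illustration of what a 1D CNN layer does to a one-dimensional signal, the sketch below slides a small kernel along a sequence and applies a ReLU. It is a generic example, not the model from any of the cited papers: the kernel values, the signal, and the function names are hypothetical, and a trained network would learn its kernels from data.

```python
def conv1d(signal, kernel, bias=0.0, stride=1):
    """Valid (no padding) 1-D cross-correlation, the operation used in CNN layers."""
    k = len(kernel)
    out = []
    for start in range(0, len(signal) - k + 1, stride):
        window = signal[start:start + k]
        out.append(sum(w * x for w, x in zip(kernel, window)) + bias)
    return out


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


# Hypothetical input sequence and a hand-picked kernel that responds to rising edges.
signal = [0.0, 1.0, 3.0, 2.0, 0.0, -1.0]
edge_kernel = [-1.0, 0.0, 1.0]
feature_map = relu(conv1d(signal, edge_kernel))
```

The resulting feature map is large where the signal rises and zero where it falls, which is the kind of local pattern a learned 1D filter can pick out of a feature sequence.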