2019 IEEE International Conference on Signal Processing, Information, Communication & Systems (SPICSCON)
DOI: 10.1109/spicscon48833.2019.9065172
Convolutional Neural Network (CNN) Based Speech-Emotion Recognition

Cited by 39 publications (11 citation statements)
References 8 publications
“…Precision is the ability of our model to identify the correctly predicted positives among all the predicted positives and is given by Eq. (5). Our model has the highest precision, 95%, for the surprise emotion and the lowest, 82%, for the happy emotion.…”
Section: Results (mentioning)
confidence: 77%
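The per-class precision figures quoted above can be reproduced with scikit-learn. The following is a minimal sketch, not the cited authors' code; the emotion label set and the true/predicted arrays are placeholders for illustration.

from sklearn.metrics import precision_score

# Hypothetical true and predicted emotion labels for a handful of test utterances.
y_true = ["surprise", "happy", "angry", "surprise", "happy", "sad"]
y_pred = ["surprise", "sad",   "angry", "surprise", "happy", "sad"]

labels = ["angry", "happy", "sad", "surprise"]

# average=None returns one precision value per class, as in the quoted results.
per_class = precision_score(y_true, y_pred, labels=labels, average=None, zero_division=0)
for emotion, p in zip(labels, per_class):
    print(f"{emotion:>9}: precision = {p:.2f}")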
“…Precision = TruePositive / (TruePositive + FalsePositive) (5). Recall measures the model's ability to identify the correct positives among all the existing positives in the test dataset and is given by Eq. (6).…”
Section: Precision (mentioning)
confidence: 99%
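Eq. (5) and Eq. (6) as quoted are the standard precision and recall definitions. A direct implementation from raw counts is shown below; the example counts are hypothetical and only illustrate the arithmetic.

def precision(tp: int, fp: int) -> float:
    # Eq. (5): TruePositive / (TruePositive + FalsePositive)
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp: int, fn: int) -> float:
    # Eq. (6): TruePositive / (TruePositive + FalseNegative)
    return tp / (tp + fn) if (tp + fn) else 0.0

# Hypothetical counts: 19 correct "surprise" predictions out of 20 predicted -> 0.95.
print(precision(tp=19, fp=1))
print(recall(tp=19, fn=3))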
“…Over the last few years, researchers have introduced different deep learning algorithms using various discriminative features [18][19][20]. At present, deep learning methods still use several low-level descriptor (LLD) features, which are different from traditional SER features [21]. So, extracting more detailed and relevant emotional information from speech is the first issue we have to address.…”
Section: Introduction (mentioning)
confidence: 99%
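The low-level descriptor (LLD) features mentioned above are typically frame-level measurements such as MFCCs, zero-crossing rate, and energy. As a hedged illustration only (librosa and the file name are assumptions, not tools named in the cited works), a minimal extraction sketch:

import numpy as np
import librosa

# Load a hypothetical utterance at 16 kHz.
y, sr = librosa.load("speech_sample.wav", sr=16000)

mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # 13 MFCCs per frame
zcr = librosa.feature.zero_crossing_rate(y)          # zero-crossing rate per frame
rms = librosa.feature.rms(y=y)                       # frame energy

# Stack the frame-level LLDs into one (n_features, n_frames) matrix.
llds = np.vstack([mfcc, zcr, rms])
print(llds.shape)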
“…Some of them are the convolutional neural network (CNN), the recurrent neural network (RNN), and the support vector machine (SVM) [10], [14], [17]. Among those classification models, CNN is well suited to processing images and can be applied to speech emotion recognition, since the extracted audio features can take the form of an image. Prior research on speech emotion recognition shows that CNN produces higher accuracy than RNN and SVM models [14], [18]. This led us to choose CNN as the platform for the speech emotion recognition model.…”
mentioning
confidence: 99%
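To make the "audio features as a picture" idea concrete, the sketch below builds a small CNN over a fixed-size single-channel spectrogram. It is a generic illustration under assumed dimensions (128 x 128 input, 7 emotion classes), not the architecture of the cited paper.

import tensorflow as tf
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Input(shape=(128, 128, 1)),          # spectrogram treated as an image
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(7, activation="softmax"),      # 7 emotion classes (assumed)
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()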
“…Most speech emotion recognition research uses datasets equipped with emotion labels. Some of the datasets used by researchers are the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), Surrey Audio-Visual Expressed Emotion (SAVEE), the Interactive Emotional Dyadic Motion Capture database (IEMOCAP), the Toronto Emotional Speech Set (TESS), CASIA, and EMO [9], [11], [14], [15], [22], [23]. The work of combining datasets is extended by de Pinto et al. [15] to obtain a model with better performance.…”
mentioning
confidence: 99%
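Combining corpora such as RAVDESS and TESS mostly comes down to mapping each corpus' file-naming scheme onto a shared emotion label set. The sketch below assumes the usual file-name conventions of those two corpora and local directory names "RAVDESS" and "TESS"; it is illustrative only, not the pipeline of de Pinto et al.

from pathlib import Path

def ravdess_label(path: Path) -> str:
    # RAVDESS encodes emotion as the third hyphen-separated field, e.g. 03-01-05-...-12.wav.
    code = path.stem.split("-")[2]
    return {"01": "neutral", "03": "happy", "04": "sad", "05": "angry",
            "06": "fearful", "07": "disgust", "08": "surprised"}.get(code, "other")

def tess_label(path: Path) -> str:
    # TESS encodes emotion as the last underscore-separated token, e.g. OAF_back_happy.wav.
    return path.stem.split("_")[-1].lower()

samples = []
samples += [(str(p), ravdess_label(p)) for p in Path("RAVDESS").rglob("*.wav")]
samples += [(str(p), tess_label(p)) for p in Path("TESS").rglob("*.wav")]
print(len(samples), "combined utterances")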