2017 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 2017
DOI: 10.1109/apsipa.2017.8282315
Speech emotion recognition using convolutional long short-term memory neural network and support vector machines

Cited by 24 publications (8 citation statements) | References 12 publications
“…In this model, speech emotion recognition is performed using a convolutional long short-term memory (LSTM) recurrent neural network (RNN) with a phoneme-based feature extractor. The technique outputs phoneme-based emotion probabilities for all frames of an input utterance [22]. Nowadays there is increasing interest in applying DL techniques to learn emotional aspects from an emotional database.…”
Section: Feature Classification Using Convolutional LSTM-RNN
confidence: 99%
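The excerpt above describes a network that emits emotion probabilities for every frame of an utterance. A minimal sketch of how such frame-wise probabilities could be pooled into one utterance-level decision, assuming one logit vector per frame (the function names and shapes here are illustrative, not from the cited paper):

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def utterance_probs(frame_logits):
    """Average frame-wise emotion probabilities over the whole utterance."""
    return softmax(frame_logits, axis=-1).mean(axis=0)

rng = np.random.default_rng(0)
frame_logits = rng.normal(size=(120, 4))  # e.g. 120 frames, 4 emotion classes
probs = utterance_probs(frame_logits)     # one probability vector per utterance
```

Mean pooling is only one plausible aggregation; the cited work may combine frame posteriors differently (e.g. via an SVM stage, as the paper's title suggests).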
“…This model has only one LSTM layer, whose nodes use the sigmoid activation function. At the output stage, the softmax activation function was used to obtain probabilities over 4 emotions, 36 phoneme-class-based emotions, and 192 phoneme-based emotions, respectively [22].…”
Section: On Considering Individual Parameters With Logistic
confidence: 99%
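The output stage described above can be pictured as three softmax heads over a shared hidden state, one per label granularity (4, 36, and 192 classes). A hedged numpy sketch; the hidden size and weight names are assumptions for illustration only:

```python
import numpy as np

def softmax(x):
    # Stable softmax for a single vector.
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

rng = np.random.default_rng(1)
hidden = rng.normal(size=64)  # hypothetical final LSTM hidden state

# Three output heads: emotions, phoneme-class emotions, phoneme emotions.
heads = {"emotion": 4, "phoneme_class": 36, "phoneme": 192}
weights = {name: rng.normal(size=(n, 64)) for name, n in heads.items()}
probs = {name: softmax(W @ hidden) for name, W in weights.items()}
```

Each head yields a proper probability distribution over its own label set, matching the 4/36/192 breakdown the excerpt mentions.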
“…This resulted in 40 features that were then orthogonalised by a PCA. We used the architecture proposed in [17] and improved it by employing a variable network depth. For training, we implemented a dropout layer after each block and a sparsity constraint.…”
Section: Feature Processing
confidence: 99%
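The PCA orthogonalisation step mentioned above (decorrelating a 40-dimensional feature set) can be sketched with a plain SVD; the data shapes below are invented for illustration:

```python
import numpy as np

def pca_orthogonalise(X):
    """Centre X and project it onto its principal axes via SVD."""
    Xc = X - X.mean(axis=0)
    # Right singular vectors Vt are the principal axes of the centred data.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt.T  # components are mutually uncorrelated

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 40))   # e.g. 200 frames x 40 raw features
Z = pca_orthogonalise(X)
C = Z.T @ Z                      # off-diagonal entries are ~0
```

Since Z = U·S with orthonormal U, the Gram matrix Z.T @ Z is diagonal, which is exactly the decorrelation PCA provides before the features enter the network.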
“…In contrast, we addressed the low-accuracy issues and proposed a novel framework for emotion recognition that utilises speech signals. We designed a system that recognises the local hidden emotional cues in the raw speech signal by utilising hierarchical ConvLSTM blocks [17]. We adopted two types of sequential learning in this framework.…”
Section: Introduction
confidence: 99%