2019
DOI: 10.1007/s00779-019-01246-9

A novel speech emotion recognition algorithm based on wavelet kernel sparse classifier in stacked deep auto-encoder model

Abstract: Because contextual information has an important impact on the speaker's emotional state, how to use emotion-related context information for feature learning is a key problem. Existing speech emotion recognition algorithms achieve relatively high recognition rates, but they are not well suited to real-life speech emotion recognition systems. To address these issues, a novel speech emotion recognition algorithm based on improved stacked kerne…

Citations: cited by 22 publications (21 citation statements)
References: 29 publications
“…The proposed method is compared with the methods in [22], [26], [28], [30], and [31]. Experiment results are shown in Table 3, where the evaluation standard is the average of emotion recognition rate of discrete emotional states.…”
Section: Comparative Results Of Multi-modal Fusion Emotion Model Experiments
Citation type: mentioning
confidence: 99%
“…Grid search is also used to adjust the hyperparameters of each tested machine learning model through the Spark cluster to shorten the execution time. [31] proposed a speech emotion recognition algorithm based on the superposed sparse depth model. The improvement of this algorithm is based on the automatic encoder, denoising automatic encoder and sparse automatic encoder.…”
Section: (3) Multimodal Emotion Recognition Methods
Citation type: mentioning
confidence: 99%
“…MTC-AE contains multiple local DNNs based on different low-level descriptors with different statistical functions that are partly concatenated together, by which the structure is enabled to consider both local and global features simultaneously. Pengcheng Wei et al [30] proposed an algorithm based on an autoencoder, denoising autoencoder, and sparse autoencoder. The first layer of the structure uses a denoising autoencoder to learn a hidden feature with a larger dimension than the dimension of the input features, and the second layer employs a sparse autoencoder to learn sparse features.…”
Section: Related Work
Citation type: mentioning
confidence: 99%
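The two-layer structure described in the statement above (a denoising autoencoder whose hidden layer is wider than the input, followed by a sparse autoencoder) can be sketched roughly as follows. This is a minimal NumPy illustration, not the cited paper's implementation: the layer sizes, the masking-noise rate, the KL sparsity penalty, and all helper names (`init_layer`, `encode`, `decode`, `denoising_loss`, `sparse_loss`) are assumptions made for the example. Only forward passes and losses are shown; training and the wavelet kernel sparse classifier are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_layer(n_in, n_hidden):
    """Small random weights for one encoder/decoder pair (hypothetical helper)."""
    return {
        "W_enc": rng.normal(0.0, 0.1, (n_in, n_hidden)),
        "b_enc": np.zeros(n_hidden),
        "W_dec": rng.normal(0.0, 0.1, (n_hidden, n_in)),
        "b_dec": np.zeros(n_in),
    }

def encode(layer, x):
    return sigmoid(x @ layer["W_enc"] + layer["b_enc"])

def decode(layer, h):
    return sigmoid(h @ layer["W_dec"] + layer["b_dec"])

def denoising_loss(layer, x, drop_prob=0.3):
    """Mean squared error of reconstructing the clean input from a masked copy."""
    mask = rng.random(x.shape) >= drop_prob  # zero out roughly drop_prob of the entries
    x_hat = decode(layer, encode(layer, x * mask))
    return np.mean((x_hat - x) ** 2)

def sparse_loss(layer, h, rho=0.05, beta=1.0):
    """Reconstruction error plus a KL penalty pushing mean activations toward rho."""
    code = encode(layer, h)
    h_hat = decode(layer, code)
    rho_hat = np.clip(code.mean(axis=0), 1e-8, 1 - 1e-8)
    kl = rho * np.log(rho / rho_hat) + (1 - rho) * np.log((1 - rho) / (1 - rho_hat))
    return np.mean((h_hat - h) ** 2) + beta * kl.sum()

# Assumed sizes: the first hidden layer (32) is wider than the input (20).
n_features, n_hidden1, n_hidden2 = 20, 32, 16
x = rng.random((8, n_features))           # stand-in for 8 acoustic feature vectors

layer1 = init_layer(n_features, n_hidden1)  # denoising autoencoder
layer2 = init_layer(n_hidden1, n_hidden2)   # sparse autoencoder

h1 = encode(layer1, x)    # overcomplete hidden features
h2 = encode(layer2, h1)   # features passed on to a classifier

loss1 = denoising_loss(layer1, x)
loss2 = sparse_loss(layer2, h1)
```

In a layer-wise scheme of this kind, `loss1` would be minimized first, then `loss2` with the first layer held fixed, and `h2` would be the input to whatever classifier follows.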
“…A speech emotion recognition algorithm based on an improved stack kernel sparse depth model is proposed in Reference [13]. The algorithm is improved based on an automatic encoder, denoising automatic encoder, and sparse automatic encoder.…”
Section: Related Work
Citation type: mentioning
confidence: 99%