2021
DOI: 10.1007/s10772-020-09792-x

Fusion of mel and gammatone frequency cepstral coefficients for speech emotion recognition using deep C-RNN

Cited by 64 publications (10 citation statements)
References 29 publications
“…This idea has profoundly impacted subsequent research on effective teaching and learning. Kumaran et al. later proposed a classification theory of educational goals, which requires a classification system based on emotions [10]. The classification dimensions are mainly reflected in five levels: acceptance, response, value judgment, organization, and characterization of value and value complexes, all of which have specific emotional meanings and sublevels corresponding to the level.…”
Section: Related Work (mentioning)
confidence: 99%
“…The literature [17] optimized the inception structure by incorporating LSTM and proposed OI-LSTM model, which has excellent recognition effect and also the model has good fault tolerance. The literature [18] proposed the recursive multilevel fusion network (RMFN), which decomposes the spatio-temporal fusion problem into multiple stages, each focusing on a subset of multimodal signals for specialized and efficient fusion. The literature [19] proposes a multimodal information fuzzy fusion algorithm.…”
Section: Related Work (mentioning)
confidence: 99%
“…To make Sphinx-4 be able to recognize concatenation, it first needs to be able to accept concatenated grammar and then build a search grid based on the input grammar nodes, so that when the input speech contains concatenated components, it can be detected and recognized by the system. This is achieved by adding methods to the original class that can dynamically generate new grammar nodes so that the system can accept grammar nodes processed by concatenation rules [25]. The more important parameter in the process of fundamental frequency feature extraction is the setting of the frequency range of the band-pass filter, which is set to 52 and 620, respectively, and the window function adopts Gaussian function.…”
Section: Experimental Design For Automatic Assessment Of Spoken (mentioning)
confidence: 99%