2022
DOI: 10.36227/techrxiv.21309600.v1
Preprint

A Residual Multi-Scale Convolutional Transformer Network with Chunk-level Log-Mel Spectrograms for Speech Emotion Recognition

Abstract: The great variety of human emotional expression, together with differences in how people perceive and annotate emotions, makes Speech Emotion Recognition (SER) an ambiguous and challenging task. With the development of deep learning, considerable progress has been made in supervised SER systems. However, existing convolutional neural networks have certain limitations, such as their inability to capture global features well, even though these features contain important emotional information. In addition, due to the subjec…

References 56 publications