2023
DOI: 10.1016/j.bspc.2023.105052

Multimodal emotion recognition based on audio and text by using hybrid attention networks

Cited by 25 publications (1 citation statement)
References 65 publications
“…The transformer based networks (Dosovitskiy et al, 2020) utilized the self-attention mechanism to build long-term relationships of dependency and could obtain competitive results in image recognition. It was noted that the transformer-based model [29][30][31][32] had mainly focused on improving the ability to extract the global context information and ignored the detailed information. MLP-Mixer [33] showed that pure MLP-based networks could achieve competitive performance in image segmentation since MLP can replace the self-attention mechanism in some extent.…”
Section: Introduction (mentioning)
confidence: 99%