Cross-modal fusion for multi-label image classification with attention mechanism (2022)
DOI: 10.1016/j.compeleceng.2022.108002

Cited by 17 publications (6 citation statements). References 16 publications.
“…The attention mechanism in deep learning simulates the way human attention works, focusing on more important regional information and reducing the interference from unimportant information. In the process of image processing, more attention is paid to regions that match specific features, thus achieving efficient and rational allocation of computational resources [29,30]. Currently, attention mechanisms have been widely applied in object detection models.…”
Section: SimAM Attention Module
confidence: 99%
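The SimAM module named in this section heading is a parameter-free attention mechanism: each activation is re-weighted by how strongly it deviates from its channel mean. A minimal NumPy sketch, following the published SimAM energy formula (function and variable names are my own, not the reference implementation):

```python
import numpy as np

def simam(x, lam=1e-4):
    """Parameter-free SimAM attention over a feature map x of shape (C, H, W).

    Neurons that deviate most from their channel mean receive the largest
    sigmoid-gated weights; lam is a small regularizer on the variance term.
    """
    c, h, w = x.shape
    n = h * w - 1
    mu = x.mean(axis=(1, 2), keepdims=True)          # per-channel mean
    d = (x - mu) ** 2                                # squared deviation
    v = d.sum(axis=(1, 2), keepdims=True) / n        # per-channel variance
    e_inv = d / (4.0 * (v + lam)) + 0.5              # inverse energy
    return x * (1.0 / (1.0 + np.exp(-e_inv)))        # sigmoid gating
```

Because the gate is a sigmoid in (0, 1), the module rescales activations without adding any learnable parameters, which is what makes SimAM cheap to drop into detection backbones.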
“…Cross-modal attention mechanism methods can be realized by dot product attention, bilinear attention, or multi-head attention, etc. [14]. 4) Cross-modal Pre-training Methods: By pre-training on large-scale multimodal data, cross-modal representation capabilities are learned and then fine-tuned on specific tasks.…”
Section: Related Work 2.1 Sentiment Analysis of Multimodal Data
confidence: 99%
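As a concrete illustration of the dot-product variant named in the excerpt, here is a minimal scaled dot-product cross-attention sketch in NumPy, where queries from one modality (e.g. text tokens) attend over keys/values from another (e.g. image regions). Names and shapes are my own illustration; real models add learned projections and multiple heads:

```python
import numpy as np

def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(queries, keys, values):
    """Scaled dot-product cross-attention.

    queries: (Nq, d) features from modality A
    keys, values: (Nk, d) features from modality B
    Returns the attended features (Nq, d) and the weight matrix (Nq, Nk).
    """
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)   # (Nq, Nk) similarity
    weights = softmax(scores, axis=-1)       # each query's row sums to 1
    return weights @ values, weights
```

A bilinear variant would replace `queries @ keys.T` with `queries @ W @ keys.T` for a learned matrix `W`; multi-head attention runs several such maps in parallel on projected subspaces.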
“…When processing images, it can pay more attention to image regions that can match target features and ignore background regions that cannot match. In other words, it can achieve a reasonable allocation of information resources [26,27]. Current attention mechanisms are more widely used in CNN architectures, including the channel attention mechanism, SENet [28] and the spatial attention mechanism, CBAM [29].…”
Section: Optimization at the Stage of Feature Output
confidence: 99%
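The excerpt above names SENet's channel attention; a minimal squeeze-and-excitation sketch in NumPy, assuming caller-supplied weight matrices (names are illustrative, not the SENet reference code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_block(x, w1, w2):
    """SENet-style channel attention on a feature map x of shape (C, H, W).

    w1: (C // r, C) reduction weights; w2: (C, C // r) expansion weights,
    where r is the reduction ratio. Squeeze (global average pool), excite
    (two fully connected layers), then rescale each channel by its gate.
    """
    s = x.mean(axis=(1, 2))          # squeeze: (C,)
    z = np.maximum(w1 @ s, 0.0)      # excite, ReLU: (C // r,)
    g = sigmoid(w2 @ z)              # channel gates in (0, 1): (C,)
    return x * g[:, None, None]      # broadcast rescaling
```

Spatial attention as in CBAM is the complementary operation: it pools across channels instead, producing an (H, W) gate that highlights informative locations rather than informative channels.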