2024
DOI: 10.1609/aaai.v38i7.28538

Identification of Necessary Semantic Undertakers in the Causal View for Image-Text Matching

Huatian Zhang, Lei Zhang, Kun Zhang, et al.

Abstract: Image-text matching bridges vision and language, which is a fundamental task in multimodal intelligence. Its key challenge lies in how to capture visual-semantic relevance. Fine-grained semantic interactions come from fragment alignments between image regions and text words. However, not all fragments contribute to image-text relevance, and many existing methods are devoted to mining the vital ones to measure the relevance accurately. How well image and text relate depends on the degree of semantic sharing bet…

Cited by 4 publications
References 36 publications