2023
DOI: 10.1007/978-3-031-30675-4_43
Wukong-CMNER: A Large-Scale Chinese Multimodal NER Dataset with Images Modality

Cited by 2 publications (1 citation statement) · References 33 publications
“…However, direct alignment of cross-modality only encourages integrating information from different modalities while mixing the noise from each modality irrelevant to our task. Thus, we use the mutual information estimator MINE (Belghazi et al. 2018) to enhance mutual information, which can be utilized to mine the modal-invariant information between different modalities and filter out modality-specific random noise (Qi and Qin 2023; Bao et al. 2023). Specifically, we maximize the MI between joint embedding h_j and uni-modal embedding h_m:…”
Section: Mutual Information Enhanced Cross-modality Alignment
confidence: 99%
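The quoted passage describes maximizing mutual information between a joint embedding h_j and a uni-modal embedding h_m with the MINE estimator (Belghazi et al. 2018); the exact objective is truncated in the excerpt. Below is a minimal PyTorch sketch of the Donsker-Varadhan lower bound that MINE optimizes, assuming a simple MLP statistics network. The names MINECritic, mine_lower_bound, dim, and hidden are illustrative, not taken from the cited work.

```python
# Hypothetical sketch: MINE lower bound on I(h_j; h_m) (Belghazi et al. 2018).
# All class/function names here are illustrative, not from the cited paper.
import math
import torch
import torch.nn as nn


class MINECritic(nn.Module):
    """Statistics network T(h_j, h_m) scoring embedding pairs."""

    def __init__(self, dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, h_j: torch.Tensor, h_m: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([h_j, h_m], dim=-1)).squeeze(-1)


def mine_lower_bound(critic: MINECritic,
                     h_j: torch.Tensor,
                     h_m: torch.Tensor) -> torch.Tensor:
    """Donsker-Varadhan bound: E_joint[T] - log E_marginal[exp(T)]."""
    # Joint samples: aligned (h_j, h_m) pairs from the same instance.
    t_joint = critic(h_j, h_m)
    # Marginal samples: shuffle h_m across the batch to break the pairing.
    h_m_shuffled = h_m[torch.randperm(h_m.size(0))]
    t_marginal = critic(h_j, h_m_shuffled)
    # log of the empirical mean of exp(T) over marginal samples.
    log_mean_exp = torch.logsumexp(t_marginal, dim=0) - math.log(t_marginal.size(0))
    return t_joint.mean() - log_mean_exp


# Assumed usage: maximize the bound (i.e. minimize its negative) alongside the
# task loss, where h_j and h_m come from the cross-modality alignment model.
# loss_mi = -mine_lower_bound(critic, h_j, h_m)
```

The shuffled batch serves as an approximate sample from the product of marginals; maximizing this bound with respect to both the critic and the encoders encourages the joint and uni-modal embeddings to share modal-invariant information, which is the role the citing paper attributes to MINE.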