Proceedings of the Second Workshop on NLP for Positive Impact (NLP4PI) 2022
DOI: 10.18653/v1/2022.nlp4pi-1.20
Hate-CLIPper: Multimodal Hateful Meme Classification based on Cross-modal Interaction of CLIP Features

Cited by 9 publications (2 citation statements) | References 0 publications
“…The solution's broad knowledge base, while extensive, is less efficient than specialized models such as QLoRA [2] or HateClipper [4], which fine-tune knowledge more effectively for specific tasks. However, this paper suggests distilling large VLMs into simpler forms as a viable strategy when detailed, task-specific data is unavailable.…”
Section: Discussion and Limitations
confidence: 99%
“…LLaVA's innovation of increasing the input dimensions of images and allocating more tokens for encoding visual information significantly enhanced model performance. Using OpenAI's CLIP technology, they managed to encode images up to 672x672 pixels, up from the original 336x336, a substantial improvement. Local testing on the MultiOFF dataset [10] with the adjusted resolution resulted in a 6% improvement in AUROC, showing a direct correlation between enhanced image resolution and better meme understanding.…”
Section: Natural Language Understanding and Visual Understanding
confidence: 99%
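The citation above reports its resolution gain in terms of AUROC. As context, a minimal sketch of how AUROC is computed (not code from the cited work; the label and score values below are purely illustrative):

```python
def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive example
    is scored above a randomly chosen negative one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical classifier scores for 4 memes (1 = hateful):
labels = [0, 0, 1, 1]
print(auroc(labels, [0.4, 0.6, 0.5, 0.7]))  # 0.75: one negative outranks a positive
print(auroc(labels, [0.3, 0.4, 0.6, 0.8]))  # 1.0: perfect ranking
```

A 6% absolute improvement in this metric, as reported for MultiOFF, means the higher-resolution model ranks hateful memes above non-hateful ones noticeably more often.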