Cross-Modal Sentiment Analysis Based on CLIP Image-Text Attention Interaction

Lu, Xintao; Ni, Yonglong; Ding, Zuohua

doi:10.14569/ijacsa.2024.0150290

IJACSA

2024

Abstract: Multimodal sentiment analysis is a traditional text-based sentiment analysis technique. However, the field of multimodal sentiment analysis still faces challenges such as inconsistent cross-modal feature information, poor interaction capabilities, and insufficient feature fusion. To address these issues, this paper proposes a cross-modal sentiment model based on CLIP image-text attention interaction. The model utilizes pre-trained ResNet50 and RoBERTa to extract primary image-text features. After contrastive le…

Cited by 1 publication

References 45 publications

Image sentiment analysis based on distillation and sentiment region localization network

Zhang, Feng, Yuan et al. 2024

The Computer Journal

Accurately identifying the emotions in images is crucial for sentiment content analysis. To detect local sentiment regions and acquire discriminative sentiment features, we propose a novel model named Distillation-guided and Contrastive-enhanced Sentiment Region Localization Network (DC-SRLN) for image sentiment analysis. First, two smart but heterogeneous SRLNs are designed to locate local sentiment regions. Second, an innovative contrastive learning mode is implemented between global and local features to further enhance the discriminative ability of the sentiment features. Third, the enhanced global and local sentiment features are integrated to guide each SRLN to accurately capture local sentiment regions. Finally, an adaptive feature fusion module is created to fuse the heterogeneous features from the two SRLNs and generate new multi-view, multi-granularity sentiment semantics with more discriminative ability for image sentiment analysis. Extensive experimental results on three prevailing datasets, namely Twitter I, FI, and ArtPhoto, show that DC-SRLN achieves accuracies of 93.2%, 80.6%, and 78.7%, respectively, outperforming recent state-of-the-art baselines. Moreover, DC-SRLN needs less training time, demonstrating its high practicality. The code of DC-SRLN is freely available at https://github.com/Riley6868/DC-SRLN.
