Multimodal sarcasm detection is an emerging research field in the social Internet of Things that underpins research in artificial intelligence and human psychology. Sarcastic comments posted on social media often imply people's real attitudes toward the events they comment on, reflecting their current emotional and psychological state. However, the limited memory of Internet of Things mobile devices poses challenges for deploying sarcasm detection models, and an abundance of parameters also increases a model's inference time. Social networking platforms such as Twitter and WeChat generate large amounts of multimodal data, which, compared to unimodal data, can provide more comprehensive information. Therefore, studies of sarcasm detection in the social Internet of Things must simultaneously consider inter-modal interaction and the number of model parameters. In this paper, we propose a lightweight, knowledge-enhanced multimodal interaction model based on deep learning. By integrating visual commonsense knowledge into the sarcasm detection model, we enrich the semantic information of the image and text modal representations. In addition, we develop a multi-view interaction method that facilitates interaction between modalities from different modal perspectives. Experimental results indicate that the proposed model outperforms unimodal baselines and achieves performance comparable to multimodal baselines with a much smaller number of parameters.