Detecting Sarcasm in Multimodal Social Platforms

Schifanella, Rossano; Juan, Paloma de; Tetreault, Joel; Cao, Liangliang

doi:10.1145/2964284.2964321

Cited by 139 publications

(71 citation statements)

References 27 publications

“…However, this work did not analyze the interplay of the modalities. More recently, Schifanella et al. (2016) presented a multimodal approach for this task by considering visual content accompanying text in online sarcastic posts. They extracted semantic visual features from images using pre-trained networks and fused them with textual features.…”

Section: Related Work (mentioning)

confidence: 99%

Towards Multimodal Sarcasm Detection (An _Obviously_ Perfect Paper)

Castro, Hazarika, Pérez-Rosas et al. 2019

Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics

Sarcasm is often expressed through several verbal and non-verbal cues, e.g., a change of tone, overemphasis in a word, a drawn-out syllable, or a straight-looking face. Most of the recent work in sarcasm detection has been carried out on textual data. In this paper, we argue that incorporating multimodal cues can improve the automatic classification of sarcasm. As a first step towards enabling the development of multimodal approaches for sarcasm detection, we propose a new sarcasm dataset, Multimodal Sarcasm Detection Dataset (MUStARD), compiled from popular TV shows. MUStARD consists of audiovisual utterances annotated with sarcasm labels. Each utterance is accompanied by its context of historical utterances in the dialogue, which provides additional information on the scenario where the utterance occurs. Our initial results show that the use of multimodal information can reduce the relative error rate of sarcasm detection by up to 12.9% in F-score when compared to the use of individual modalities. The full dataset is publicly available for use at https://github.com/soujanyaporia/MUStARD.

“…Roy et al. [4] concatenate image features from a CNN model and text features from a Doc2Vec model and use them together to train a fully connected neural network to identify social media posts related to illicit drugs. Schifanella et al. [5] use visual semantics from a CNN model and text features from an NLP network model together to train traditional models, SVM and DNN, respectively. After that, they leverage the trained models to detect sarcastic social media posts.…”

Section: Human Activity Recognition Using Social Media (mentioning)

confidence: 99%

“…However, its use of uni-modal textual features cannot capture enough patterns of human activities shared on the social media because users mostly describe their daily activities and thoughts using both texts and images. Such a limitation can be relieved by incorporating the inherent multi-modality of social media into the learning process as in [4,5]. These multi-modal approaches adopt an early fusion technique which leverages concatenated features of text and image to their proposed classifiers.…”

Section: Introduction (mentioning)

confidence: 99%
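
The early fusion mentioned in the two statements above amounts to concatenating each post's text and image feature vectors before they reach a classifier. A minimal Python sketch, assuming the features have already been extracted; the function names and the toy vectors are hypothetical and are not taken from any of the cited papers:

```python
from typing import List, Sequence


def early_fusion(text_vec: Sequence[float], image_vec: Sequence[float]) -> List[float]:
    """Concatenate one post's text and image features into a single vector."""
    return list(text_vec) + list(image_vec)


def fuse_batch(
    text_batch: Sequence[Sequence[float]],
    image_batch: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Apply early fusion post by post; both batches must cover the same posts."""
    if len(text_batch) != len(image_batch):
        raise ValueError("text and image batches must have the same number of posts")
    return [early_fusion(t, i) for t, i in zip(text_batch, image_batch)]


# Hypothetical toy features: 2 posts, 4-dim text vectors, 2-dim image vectors.
text_features = [[0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7, 0.8]]
image_features = [[1.0, 0.0], [0.0, 1.0]]

fused = fuse_batch(text_features, image_features)
print(len(fused), len(fused[0]))  # 2 6
```

The fused vectors would then be passed to whatever classifier is being trained; the classifier itself is left out here because the statements do not specify its configuration.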

Human Activity Recognition Using Semi-supervised Multi-modal DEC for Instagram Data

Kim, Han, Son et al. 2020

Advances in Knowledge Discovery and Data Mining


Human Activity Recognition (HAR) using social media provides a solid basis for a variety of context-aware applications. Existing HAR approaches have adopted supervised machine learning algorithms using texts and their meta-data such as time, venue, and keywords. However, their recognition accuracy may decrease when applied to image-sharing social media where users mostly describe their daily activities and thoughts using both texts and images. In this paper, we propose a semi-supervised multi-modal deep embedding clustering method to recognize human activities on Instagram. Our proposed method learns multi-modal feature representations by alternating a supervised learning phase and an unsupervised learning phase. By utilizing a large amount of unlabeled data, it learns a more generalized feature distribution for each HAR class and avoids overfitting to limited labeled data. Evaluation results show that leveraging multi-modality and unlabeled data is effective for HAR and our method outperforms existing approaches.


“…They introduce a complex classification model that works over an entire tweet sequence and not on one tweet at a time. Integration between linguistic and contextual features extracted from the analysis of visuals embedded in multimodal posts was deployed for sarcasm detection [23]. A framework based on the linguistic theory of context incongruity and an introduction of inter-sentential incongruity by considering the history of the posts in the discussion thread was considered for sarcasm detection [11].…”

Section: Context-based Approach (mentioning)

confidence: 99%

Sarcasm as a Contradiction Between a Tweet and its Temporal Facts : A Pattern Based Approach

Bharti, Babu 2018

IJNLC


In the context of Indian languages, sarcasm detection in Hindi is a tedious job as it is rich in morphology and complex in structure. The annotated resources for sarcastic Hindi sentences are almost negligible for machine learning analysis. Here, we propose a pattern-based framework for sarcasm detection in Hindi tweets. It has been observed that a tweet is sarcastic if it contradicts its temporal facts intentionally. The temporal fact is a collection of time-dependent facts which may change over time. We used Hindi news with timestamps as a corpus of temporal facts. The timestamp describes the fact period of any entity. In this research, a temporal fact is represented as a key-value pair. To form a pair, one needs to extract triplets, i.e. subject, verb and object, for every sentence. Next, a key is formed using subject and verb together. The value is formed using object and timestamp together. To predict whether a tweet is sarcastic, one needs to extract the triplets from the input tweet and form a pair. Then the pair of the input tweet is mapped to the related pair in the corpus of temporal facts, and the two are checked to see whether they coincide. If they contradict, the input tweet is considered sarcastic. The proposed approach outperforms state-of-the-art techniques for Hindi sarcasm detection, attaining an accuracy of 82.8%.
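
The pair-matching step described in this abstract can be illustrated with a short Python sketch. It is a simplified reading, not the authors' implementation: triplet extraction is assumed to have been done already, English toy data stands in for Hindi news, a "contradiction" is taken to mean a different object under the same subject-verb key, and matching on the fact period is omitted. All names (`make_pair`, `build_corpus`, `is_sarcastic`) are hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple

Triplet = Tuple[str, str, str]    # (subject, verb, object)
FactKey = Tuple[str, str]         # (subject, verb)
FactValue = Tuple[str, str]       # (object, timestamp)


def make_pair(triplet: Triplet, timestamp: str) -> Tuple[FactKey, FactValue]:
    """Form the key from subject and verb, the value from object and timestamp."""
    subject, verb, obj = (part.strip().lower() for part in triplet)
    return (subject, verb), (obj, timestamp)


def build_corpus(news: Iterable[Tuple[Triplet, str]]) -> Dict[FactKey, FactValue]:
    """Index timestamped news triplets as temporal facts (a later item overwrites an earlier one)."""
    corpus: Dict[FactKey, FactValue] = {}
    for triplet, timestamp in news:
        key, value = make_pair(triplet, timestamp)
        corpus[key] = value
    return corpus


def is_sarcastic(
    tweet: Triplet, timestamp: str, corpus: Dict[FactKey, FactValue]
) -> Optional[bool]:
    """Return True on contradiction, False on agreement, None if no related fact exists."""
    key, (tweet_obj, _) = make_pair(tweet, timestamp)
    fact = corpus.get(key)
    if fact is None:
        return None
    fact_obj, _ = fact
    return fact_obj != tweet_obj


# Hypothetical toy corpus of temporal facts.
corpus = build_corpus([(("team", "lost", "final"), "2018-03-01")])

print(is_sarcastic(("Team", "lost", "nothing"), "2018-03-02", corpus))  # True
print(is_sarcastic(("team", "lost", "final"), "2018-03-02", corpus))    # False
print(is_sarcastic(("team", "won", "final"), "2018-03-02", corpus))     # None
```

Returning `None` when no related fact is found keeps "cannot decide" separate from "not sarcastic", which the abstract does not address.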
