2017
DOI: 10.48550/arxiv.1708.02099
Preprint
Multimodal Classification for Analysing Social Media

Abstract: Classification of social media data is an important approach in understanding user behavior on the Web. Although information on social media can be of different modalities such as texts, images, audio or videos, traditional approaches in classification usually leverage only one prominent modality. Techniques that are able to leverage multiple modalities are often complex and susceptible to the absence of some modalities. In this paper, we present simple models that combine information from different modalities…


Cited by 7 publications (12 citation statements)
References 19 publications
“…On the other hand, although the number of toxic interactions is smaller, they are richer in content as well as multimodal elements, compared to non-toxic interactions [23] (see Tables 5a, 5b, 5c, 5d, and 7). Prior research shows that appropriate incorporation of multimodal elements in modeling with social media data would improve performance [23,24,12,32]. In Table 5a, we see mean and maximum number of tweets per interaction for Toxic ones being significantly higher than Non-toxic ones, suggesting the intensity of the toxic content.…”
Section: Toxic
confidence: 84%
“…In Table 5a, we see mean and maximum number of tweets per interaction for Toxic ones being significantly higher than Non-toxic ones, suggesting the intensity of the toxic content. Further, according to Tables 5a, 5b, 5c, 5d, and 7, in the Toxic content, the use of multimodal elements such as image, video, and emoji, is clearly higher, suggesting that the incorporation of these different modalities in the analysis of this dataset will be critical for a reliable outcome [24,23,12,32].…”
Section: Toxic
confidence: 99%
“…There have been a wide variety of neural network-based multi-modal methods and applications. These methods can be roughly categorized into three groups, methods that use network fusion (concatenation) [1,2,3], methods that use gating [4], and methods that use cross-modal training [5]. For time series, it is especially common to combine text with images in document recognition [3,4], natural scene image recognition [1], and cross-modal retrieval [6,7].…”
Section: Related Work
confidence: 99%
“…Combining audio with video is another common use for multi-modal networks [2,8,9].…”
Section: Related Work
confidence: 99%
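The citation statement above distinguishes fusion by concatenation from gated fusion. As a minimal sketch of those two ideas (not the implementation used in the cited paper, and assuming for illustration that both modalities are embedded in vectors of the same dimension), concatenation simply stacks the per-modality features, while a learned sigmoid gate interpolates between them dimension by dimension:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def fuse_concat(text_feat, image_feat):
    # Fusion by concatenation: the joint feature is [text; image],
    # and a downstream classifier learns how to weigh the modalities.
    return np.concatenate([text_feat, image_feat])

def fuse_gated(text_feat, image_feat, w_gate, b_gate):
    # Gated fusion: a learned gate z in (0, 1)^d, computed from both
    # modalities, mixes them as z * text + (1 - z) * image.
    joint = np.concatenate([text_feat, image_feat])
    z = sigmoid(joint @ w_gate + b_gate)
    return z * text_feat + (1.0 - z) * image_feat

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 4
    t = rng.normal(size=d)               # stand-in text embedding
    v = rng.normal(size=d)               # stand-in image embedding
    w = rng.normal(size=(2 * d, d)) * 0.1  # illustrative gate weights
    b = np.zeros(d)
    print(fuse_concat(t, v).shape)       # concatenated: (8,)
    print(fuse_gated(t, v, w, b).shape)  # gated: (4,)
```

One practical difference the surrounding discussion hints at: concatenation doubles the input dimension of the downstream model, whereas gating keeps the fused dimension fixed and degrades gracefully when one modality carries little signal (the gate can saturate toward the other).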