Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop 2020
DOI: 10.18653/v1/2020.acl-srw.39
Cross-Lingual Disaster-related Multi-label Tweet Classification with Manifold Mixup

Abstract: Distinguishing informative and actionable messages from a social media platform like Twitter is critical for facilitating disaster management. For this purpose, we compile a multilingual dataset of over 130K samples for multilabel classification of disaster-related tweets. We present a masking-based loss function for partially labeled samples and demonstrate the effectiveness of Manifold Mixup in the text domain. Our main model is based on Multilingual BERT, which we further improve with Manifold Mixup. We sho…
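The abstract mentions a masking-based loss for partially labeled samples: when some labels of a multilabel example are unannotated, their loss terms are masked out so only known labels contribute. A minimal NumPy sketch of this idea (the function name, shapes, and exact formulation are assumptions for illustration; the paper's loss may differ):

```python
import numpy as np

def masked_bce_loss(logits, labels, mask):
    """Binary cross-entropy over multilabel outputs.

    `mask` is 1 where a label is annotated and 0 where it is missing;
    masked entries contribute nothing to the loss. Hypothetical sketch,
    not the paper's exact implementation.
    """
    probs = 1.0 / (1.0 + np.exp(-logits))  # per-label sigmoid
    eps = 1e-12                            # numerical stability
    per_label = -(labels * np.log(probs + eps)
                  + (1.0 - labels) * np.log(1.0 - probs + eps))
    masked = per_label * mask              # drop unannotated labels
    return masked.sum() / max(mask.sum(), 1.0)  # mean over known labels

# Two tweets, three labels; the second tweet's last label is unannotated.
logits = np.array([[2.0, -1.0, 0.5], [0.0, 3.0, -2.0]])
labels = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
mask   = np.array([[1.0, 1.0, 1.0], [1.0, 1.0, 0.0]])
loss = masked_bce_loss(logits, labels, mask)
```

Because the masked entry is multiplied by zero, flipping an unannotated label leaves the loss unchanged, which is the point of the masking.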

Cited by 22 publications (14 citation statements)
References 23 publications
“…For MLP, we represent an input example as the average of the embeddings of its tokens, and denote the method as MLP-BoW. We also use DenseCNN [14] and XML-CNN [18] following [23] for disaster and DPCNN [10] for sentiment datasets. The results are presented in Table 4.…”
Section: Results
confidence: 99%
“…We highlight a few works which explore disaster-related tweet classification in multilingual setting. One of the comprehensive works in this area was done by Raychowdhury et al in [25] which explore disaster-related text classification by applying Manifold Mixup [28] on mBERT. They aggregated multiple disaster datasets containing tweets in multiple languages into a single large dataset and performed their experiments on that dataset.…”
Section: Related Work
confidence: 99%
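The citation above describes applying Manifold Mixup (Verma et al., 2019) to mBERT hidden states: pairs of hidden representations and their labels are linearly interpolated with a Beta-sampled weight. A minimal NumPy sketch of the interpolation step (function name and shapes are illustrative assumptions, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

def manifold_mixup(hidden_a, hidden_b, label_a, label_b, alpha=0.2):
    """Mix two hidden-state vectors and their label vectors.

    lam ~ Beta(alpha, alpha), as in Manifold Mixup; the same weight
    is applied to representations and labels. Sketch only.
    """
    lam = rng.beta(alpha, alpha)
    mixed_h = lam * hidden_a + (1.0 - lam) * hidden_b
    mixed_y = lam * label_a + (1.0 - lam) * label_b
    return mixed_h, mixed_y
```

In the full method the mixing is applied at a randomly chosen intermediate layer of the encoder, and training proceeds on the mixed representation and mixed label.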
“…With regard to input representations, crisis-specific embeddings have been shown to be particularly effective (Nguyen et al, 2016). Recently, work has shown the effectiveness of deep contextualized word representations for classifying disaster-based tweets (Madichetty and M, 2020), particularly with regard to classifying over multiple disaster types (Zahera et al, 2019;Ray Chowdhury et al, 2020).…”
Section: Classification In Crisis
confidence: 99%