2020
DOI: 10.48550/arxiv.2010.12421
Preprint

TweetEval: Unified Benchmark and Comparative Evaluation for Tweet Classification

Abstract: The experimental landscape in natural language processing for social media is too fragmented. Each year, new shared tasks and datasets are proposed, ranging from classics like sentiment analysis to irony detection or emoji prediction. Therefore, it is unclear what the current state of the art is, as there is neither a standardized evaluation protocol nor a strong set of baselines trained on such domain-specific data. In this paper, we propose a new evaluation framework (TWEETEVAL) consisting of seven heterogeneou…

Citations: cited by 46 publications (70 citation statements)
References: 9 publications
“…We use Stanford Sentiment Treebank (SST-2) [28], Text Retrieval Conference (TREC-6) [29], TweetEval [30] and BBC News * datasets for our study. These datasets cover both binary and multi-class classification.…”
Section: Dataset Description | mentioning
confidence: 99%
“…The Python code for the RoBERTa framework that we applied is adapted from Barbieri et al (2020a) and is available at https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment.…”
Section: Embedding Learning Via Roberta | mentioning
confidence: 99%
“…BERT (Devlin et al, 2018) based models have achieved state of the art performance in many downstream tasks due to their superior contextualized representations of language, providing true bidirectional context to word embeddings. We will use the sentiment analysis model from (Barbieri et al, 2020), trained on a large corpus of English Tweets (60 million Tweets) for initializing our algorithm. We will refer to the sentiment analysis model from (Barbieri et al, 2020) as the TweetEval model in the remainder of the paper.…”
Section: Related Work | mentioning
confidence: 99%
“…We will use the sentiment analysis model from (Barbieri et al, 2020), trained on a large corpus of English Tweets (60 million Tweets) for initializing our algorithm. We will refer to the sentiment analysis model from (Barbieri et al, 2020) as the TweetEval model in the remainder of the paper. The TweetEval model is built on top of an English RoBERTa (Liu et al, 2019) model.…”
Section: Related Work | mentioning
confidence: 99%
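
Several of the statements above describe reusing the TweetEval sentiment model, a RoBERTa-based classifier. As a rough illustration of what that reuse might look like, here is a minimal Python sketch. It is not code from any of the citing papers. It assumes the Hugging Face model ID `cardiffnlp/twitter-roberta-base-sentiment`, the label order negative/neutral/positive taken from that model's card, and that the `transformers` and `torch` packages are installed. The helper names `softmax`, `top_label`, and `classify` are hypothetical.

```python
import math

# Assumed model ID and label order (index 0 = negative, 1 = neutral, 2 = positive).
MODEL_ID = "cardiffnlp/twitter-roberta-base-sentiment"
LABELS = ["negative", "neutral", "positive"]


def softmax(logits):
    """Convert raw scores to probabilities, shifting by the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits):
    """Return the most probable label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


def classify(text):
    """Score one tweet with the pretrained model (downloads weights on first use)."""
    # Imported lazily so the helpers above work without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return top_label(logits)
```

Calling `classify("Good night 😊")` would return a `(label, probability)` pair. The citing papers may apply extra preprocessing, such as masking user mentions and links, which this sketch leaves out.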