2020
DOI: 10.48550/arxiv.2004.05861
Preprint

ArCOV-19: The First Arabic COVID-19 Twitter Dataset with Propagation Networks

Cited by 15 publications
(18 citation statements)
“…Most of these datasets are generic, and lack annotations or labels. Examples include multilingual corpus on a wide variety of topics related to COVID-19 [CLF20, AMEP+20, HJB+20], longitudinal Twitter chatter dataset [BTW+20], multilingual dataset with location information of the users [QIO20], Twitter dataset for Arabic tweets [AAA20], Twitter dataset for popular Arabic tweets [HHSE20], and dataset for identification of stance, replies, and quotes [VCKBC20]. Most of these datasets either have no annotations at all, employ automated annotations using transfer learning or semi-supervised methods, or are not specifically designed for misinformation.…”
Section: Covid-19 Datasets (mentioning)
confidence: 99%
“…The code used for data processing is written in Python 3. The code required to hydrate tweets and to use the provided base release files is available on GitHub. Furthermore, we postulate that this large-scale, multilingual, geotagged social media data can empower multidisciplinary research communities to perform longitudinal studies, evaluate how societies are collectively coping with this unprecedented global crisis as well as to develop computational methods to address real-world challenges, including but not limited to the following:…”
Section: Usage Notes (mentioning)
confidence: 99%
“…Some of the datasets further apply language filters [1,10,20,33] or other requirements such as the availability of location information [19]. Instead of filtering from Twitter streaming data, authors of ArCOV-19 [14] collect tweets returned by the Twitter standard search API when using COVID-19 related keywords (e.g. Corona) as queries and written in Arabic.…”
Section: Related Work (mentioning)
confidence: 99%
“…Most of the found datasets are being updated regularly. The number of tweets contained in the 13 datasets range from 747,599 [14] to over 524 million [25] by the time of this study, i.e. 20 May, 2020.…”
Section: Related Work (mentioning)
confidence: 99%