Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2019
DOI: 10.1145/3341161.3342870

A large-scale empirical study of geotagging behavior on Twitter

Abstract: Geotagging on social media has become an important proxy for understanding people's mobility and social events. Research that uses geotags to infer public opinions relies on several key assumptions about the behavior of geotagged and non-geotagged users. However, these assumptions have not been fully validated. Lack of understanding the geotagging behavior prohibits people further utilizing it. In this paper, we present an empirical study of geotagging behavior on Twitter based on more than 40 billion tweets c…

Cited by 41 publications (40 citation statements)
References 27 publications
“…In order to achieve this aim, the tweet terms are compared with the words contained in a set of dictionaries: if the tweet term is found in one of the used dictionaries, it is removed. The used dictionaries are: 1) the dictionary of the language of the tweet; 2) the English dictionary because the use of anglicisms is a common routine; 3) a dictionary of the Internet slang containing the most common abbreviations used in the Internet. The tweet terms that are not filtered are considered to be names for candidate locations.…”
Section: F Candidate Location Computation (mentioning)
confidence: 99%
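
The dictionary filter described in the statement above can be illustrated with a minimal sketch. This is not the citing authors' implementation: the function name `candidate_location_terms`, the toy word lists, and the case-insensitive matching are all assumptions made for illustration. It only shows the general idea that a term found in any dictionary is dropped and the remaining terms are kept as candidate location names.

```python
def candidate_location_terms(tweet_terms, dictionaries):
    """Return the tweet terms that appear in none of the given dictionaries.

    Matching is case-insensitive here, which is an assumption; the quoted
    statement does not say how terms are normalized.
    """
    known_words = set()
    for dictionary in dictionaries:
        known_words.update(word.lower() for word in dictionary)
    # Terms that survive the filter are treated as candidate location names.
    return [term for term in tweet_terms if term.lower() not in known_words]


# Toy word lists standing in for the three dictionaries (hypothetical data).
tweet_language_words = {"incendio", "en", "el", "centro", "de"}
english_words = {"fire", "in", "the"}
internet_slang = {"omg", "lol"}

terms = ["OMG", "incendio", "en", "Valencia"]
candidates = candidate_location_terms(
    terms, [tweet_language_words, english_words, internet_slang]
)
print(candidates)  # ['Valencia']
```

With real dictionaries, the surviving terms would still include misspellings and names that are not places, so a filter of this kind only produces candidates for a later validation step.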
“…Unfortunately, identifying the location where events are taking place is one of the biggest challenges in this new field of research. The complexity of event geo-localization is related to a set of factors which include: a) not all tweets written during the discussion of an event contain information about the location of that event; b) only a very small part of the tweets (about 2% of the posted tweets [5]) are geo-located, indeed, not necessarily a user in describing an event must also indicate the place where it occurred; c) the information contained in the tweets may be inconsistent (e.g., inaccurate, badly written, ambiguous). Because of its underlying complexity, in the literature, only few efforts, characterized by crucial limitations, have been carried out to address the geo-localization problem.…”
Section: Introduction (mentioning)
confidence: 99%
“…These pioneering studies often used field work data from a handful of individuals and focused on small sets of carefully chosen features, often phonological. Inspired by this early work, researchers have used geographically tagged social media data from hundreds of thousands of users to predict user location (Paul and Dredze, 2011; Amitay et al., 2004; Han et al., 2014; Rahimi et al., 2017; Huang and Carley, 2019b; Tian et al., 2020; Zhong et al., 2020) or to develop language identification tools (Lui and Baldwin, 2012; Zubiaga et al., 2016; Jurgens et al., 2017a; Dunn and Adams, 2020). Whether it is possible at all to predict the micro-varieties of the same general [footnote 1: Our labeled data and models will be available at: https://github.com/UBC-NLP/microdialects.]…”
Section: Introduction (mentioning)
confidence: 99%
“…Despite these advantages, one major issue that hinders the usability of Twitter data is the lack of geolocation information. Only 1-3% of tweets has GPS-coordinates (Huang and Carley, 2019). Response authorities and humanitarian organizations heavily rely on geolocation information for both situational awareness and response tasks.…”
Section: Introductionmentioning
confidence: 99%