2018
DOI: 10.1145/3178112

Twitter Geolocation

Abstract: Geotagging Twitter messages is an important tool for event detection and enrichment. Despite the availability of both social media content and user network information, these two features are generally utilized separately in the methodology. In this article, we create a hybrid method that uses Twitter content and network information jointly as model features. We use Gaussian mixture models to map the raw spatial distribution of the model features to a predicted field. This approach is scalable to large dataset…
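To make the abstract's idea concrete, here is a minimal sketch of how per-feature Gaussian mixtures over coordinates could be combined into a single predicted field. It is not the article's implementation: the feature names, the hard-coded mixture parameters (a real system would fit them to geotagged data), the isotropic components, the treatment of latitude and longitude as planar coordinates, and the helper names `gaussian_pdf`, `mixture_density`, and `predict_location` are all assumptions made for illustration.

```python
import math

def gaussian_pdf(point, mean, sigma):
    """Density of an isotropic 2-D Gaussian at point = (lat, lon).

    Assumption: degrees are treated as flat planar coordinates.
    """
    d_lat = point[0] - mean[0]
    d_lon = point[1] - mean[1]
    sq_dist = d_lat * d_lat + d_lon * d_lon
    return math.exp(-sq_dist / (2 * sigma * sigma)) / (2 * math.pi * sigma * sigma)

def mixture_density(point, components):
    """Density of a mixture given as a list of (weight, mean, sigma)."""
    return sum(w * gaussian_pdf(point, mean, sigma) for w, mean, sigma in components)

def predict_location(features, feature_mixtures, candidates):
    """Return the candidate point with the highest summed mixture density.

    Content and network features are scored the same way, so both
    contribute jointly to one field over the candidate points.
    """
    def score(point):
        return sum(
            mixture_density(point, feature_mixtures[f])
            for f in features
            if f in feature_mixtures
        )
    return max(candidates, key=score)

# Hypothetical mixtures: one content feature and one network feature.
NYC, CHICAGO, LA = (40.7, -74.0), (41.9, -87.6), (34.1, -118.2)
MIXTURES = {
    "word:subway": [(0.6, NYC, 0.5), (0.4, CHICAGO, 0.5)],  # content feature
    "friend:alice": [(1.0, NYC, 1.0)],                      # network feature
}
CANDIDATES = [NYC, CHICAGO, LA]

best = predict_location(["word:subway", "friend:alice"], MIXTURES, CANDIDATES)
print(best)
```

In this toy setup the content feature alone is split between two cities, and the network feature tips the combined score toward one of them, which is the intuition behind using both feature types jointly.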

Citations: Cited by 35 publications (13 citation statements)
References: References 23 publications
“…This study was exploratory in nature and collected social media messages for which latitude and longitude coordinates could be collected from the Twitter API, but this data collection methodology is limited to collecting messages from Twitter users that enabled geolocation, a specific limitation to generating a more generalizable dataset on Twitter as it is estimated that only 1% of all tweets are geocoded (48,49). Hence, the dataset used in this study after filtering for keywords was small and likely biased, limiting the generalizability of results.…”
Section: Limitations | mentioning
confidence: 99%
“…Moreover, few users use geotagging (e.g., due to privacy issues) (Schulz et al, 2013), which reduces greatly the impact of supervised learning WD methods. Some recent studies adopt hybrid approaches, combining: WD and FN (e.g., (Rahimi et al, 2015; Bakerman et al, 2018)); WD and features derived from user account Metadata (MD) (Ozdikis et al, 2016; Dredze et al, 2013; Schulz et al, 2013); and WD, FN and MD (Williams et al, 2017).…”
Section: Related Work | mentioning
confidence: 99%
“…In contrast with several state-of-the-art works (e.g., Celik & Dokuz, 2018; Do et al, 2018; Huang & Carley, 2019; Paule et al, 2019; Ozdikis et al, 2019), a pure unsupervised WD approach is adopted and thus no geographic labeled data (e.g., tweets or user location profiles) is required, only historical tweet nouns and GT data. Moreover, we do not use LIW, as adopted in (Ozdikis et al, 2016; Williams et al, 2017; Bakerman et al, 2018; Shahraki et al, 2019), since LIW often assumes finite and rather static set of locations, typically associated to small world regions. Instead, we use tweet nouns, which can be dynamically updated and that can refer to geographic words and also other terms with a location context (e.g., events, people or organizations).…”
Section: Most Research Work That Focus On Geographic Coordinates (Gc… | mentioning
confidence: 99%
“…Hence, it is possible that not all non-English COVID-19 selling posts were detected, though non-English signal posts may contain the features subject to classification. The presence of non-English language posts and characters likely indicates that signal posts targeted non-US audiences and social media users, even though the majority of users on both of these platforms are located in the United States [29][30][31]. However, determining more precise geolocation of users was difficult as only 87 tweets and 134 Instagram posts had geotagged information available.…”
Section: Covid-19 Seller Metadata Characteristics | mentioning
confidence: 99%