2012 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining
DOI: 10.1109/asonam.2012.29

@Phillies Tweeting from Philly? Predicting Twitter User Locations with Spatial Word Usage

Abstract: We study the problem of predicting home locations of Twitter users using contents of their tweet messages. Using three probability models for locations, we compare both the Gaussian Mixture Model (GMM) and the Maximum Likelihood Estimation (MLE). In addition, we propose two novel unsupervised methods based on the notions of Non-Localness and Geometric-Localness to prune noisy data from tweet messages. In the experiments, our unsupervised approach improves the baselines significantly and shows comparab…

Cited by 73 publications (30 citation statements)
References 12 publications
“…This unfortunately requires a large hand-annotated corpus for training. Han14 systematically investigate various feature selection methods for finding geo-indicative words, such as information gain ratio (IGR) (Quinlan, 1993), Ripley's K statistic (O'Sullivan and Unwin, 2010) and geographic density (Chang et al, 2012), showing significant improvements on TWUS and TWWORLD (§2).…”
Section: Feature Selection (mentioning)
confidence: 99%
“…Feature Selection. Dimensionality reduction methods have been used to improve geolocation inference performance while reducing computational cost (Cheng et al, 2010; Chang et al, 2012; Han et al, 2012). Of existing approaches, the…”
Section: Results (mentioning)
confidence: 99%
“…Another study [54] estimated a city-level user location based purely on the content of tweets, which might include reply tweet information, without the use of any external information, such as a gazetteer or internet protocol (IP) information. Two unsupervised methods [55] have been proposed based on notions of nonlocalness and geometric localness to prune noisy data from tweets. One report [56] described language models of locations using coordinates extracted from geotagged Twitter data.…”
Section: Discussion (mentioning)
confidence: 99%