Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) 2014
DOI: 10.3115/v1/d14-1039

Hierarchical Discriminative Classification for Text-Based Geolocation

Abstract: Text-based document geolocation is commonly rooted in language-based information retrieval techniques over geodesic grids. These methods ignore the natural hierarchy of cells in such grids and fall afoul of independence assumptions. We demonstrate the effectiveness of using logistic regression models on a hierarchy of nodes in the grid, which improves upon the state of the art accuracy by several percent and reduces mean error distances by hundreds of kilometers on data from Twitter, Wikipedia, and Flickr. We …

Cited by 92 publications (98 citation statements)
References 38 publications
“…Each region is then considered as a label to train the classifiers. The approach of using k-d tree is also used in Rahimi et al (2015); Han et al (2012) and Wing and Baldridge (2014). See Table 3 for an example of the following methods.…”
Section: Ne Impact On Geolocation (mentioning)
confidence: 99%
“…Some use KL divergence between the distribution of a user's words and the words used in each region (Wing and Baldridge, 2011; Roller et al, 2012), regional topic distributions (Eisenstein et al, 2010; Ahmed et al, 2013; Hong et al, 2012), or feature selection/weighting to find words indicative of location (Priedhorsky et al, 2014; Han et al, 2012, 2014; Wing and Baldridge, 2014).…”
Section: Related Work (mentioning)
confidence: 99%
“…As mentioned in the previous section, there are several studies for the task. The majority of these studies models the location inference as a multi-class classification problem on the grid over geo-spatial areas or cities on the gazetteer such as GeoNames and DBpedia [12], [16].…”
Section: Pilot Categorization: Why Do We Focus On (mentioning)
confidence: 99%
“…In text-based geolocation, researchers have used KL divergence between the distribution of a user's words and the words used in geographic regions (Wing and Baldridge, 2011; Roller et al, 2012), regional topic distributions (Eisenstein et al, 2010; Ahmed et al, 2013; Hong et al, 2012), or feature selection/weighting to find words indicative of location (Priedhorsky et al, 2014; Han et al, 2012a, 2014; Wing and Baldridge, 2014). Han et al (2012b) showed that information gain ratio is a useful metric for measuring how location-indicative words are.…”
Section: Related Work (mentioning)
confidence: 99%