2010 IEEE Fourth International Conference on Semantic Computing 2010
DOI: 10.1109/icsc.2010.74

A Comparison of Approaches for Geospatial Entity Extraction from Wikipedia

Abstract: In this paper we target the challenge of extracting geospatial data from the article text of the English Wikipedia. We present the results of a Hidden Markov Model (HMM) based approach to identify location-related named entities in our corpus of Wikipedia articles, which are primarily about battles and wars due to their high geospatial content. The HMM NER process drives a geocoding and resolution process, whose goal is to determine the correct coordinates for each place name (often referred to as groundin…

Cited by 8 publications
(5 citation statements)
References 10 publications
“…Word2vec's dimension is 300. Hidden Markov Model (HMM) [21], CRF [23], IDCNN-CRF [29] and BLSTM-CRF [30] are used as baseline models of BERT-WWM+BLSTM-CRF urban data recognition model.…”
Section: Results and Analysis 1) Comparative Experiments Of Urban D (mentioning)
confidence: 99%
“…The selection of data source and the construction of data extraction model are two key issues. Most of the work for extracting urban data from Internet resources (e.g., [12], [21]) usually sets one or more specific websites as the stable data source. However, the use of specific websites as data sources limits the diversity of data collection, making it difficult to obtain data with available comprehensiveness.…”
Section: Data Extraction and Web Clustering (mentioning)
confidence: 99%
“…Our goal was to see if an especially trained and tuned SVM will perform better than HMM or CRF approaches, particularly for geospatial names. More details can be found in [76,75,77,91].…”
Section: Experiments In Geospatial Named Entity Extraction (mentioning)
confidence: 99%
“…This research is complemented by work on place name entity recognition and extraction [5,15] and more recent efforts to extend geoparsing methods to microblog and other more cryptic social media posts [3]. Nowadays a number of geoparsers and related software tools and services exist for identifying and geolocating places in different types of unstructured text ([4], see also [2] for a more exhaustive list), and in microblog or Twitter messages in particular [3,7].…”
Section: Introduction (mentioning)
confidence: 99%