2016
DOI: 10.3390/ijgi5050065

Using an Optimized Chinese Address Matching Method to Develop a Geocoding Service: A Case Study of Shenzhen, China

Abstract: With the coming era of big data and the rapid development and widespread applications of Geographical Information Systems (GISs), geocoding technology is playing an increasingly important role in bridging the gap between non-spatial data resources and spatial data in various fields. However, Chinese geocoding faces great challenges because of the complexity of the address string format in Chinese, which contains no delimiters between Chinese words, and the poor address management resulting from the ex…

Cited by 45 publications (31 citation statements)
References 27 publications
“…Despite availability of commercial and no-cost geocoding strategies ( Goldstein et al, 2014 ; Faure et al, 2017 ), the automated geocoding of textual documents faces challenges, especially for development of language modeling methods for textual document geocoding ( Faure et al, 2017 ). The complexity of the address string format in Chinese text-based geocoding compounds these challenges ( Tian et al, 2016 ).…”
Section: Discussion (mentioning)
confidence: 99%
“…Because of its great importance, many geocoding methods have been developed including online services, commercial in-house services, as well as no-cost strategies using R ( Goldstein et al, 2014 ; Faure et al, 2017 ). However, Chinese geocoding faces great challenges due to the complexity of the address string format in Chinese, which contains no delimiters between Chinese words, and limited address reference resources ( Tian et al, 2016 ).…”
Section: Introduction (mentioning)
confidence: 99%
“…In rule-based approaches, first, a predefined address model consisting of address model features of different types and the relationships between them is established. Then, the string of textual address is segmented by a maximum matching procedure [15][16][17][18]. In statistics-based approaches, pre-statistical variants, such as character frequency or cooccurrence probability, and NLP models, such as the N-gram model, hidden Markov model (HMM), conditional random field (CRF) or branch entropy, are used to segment the string of text into multiple words [19][20][21][22].…”
Section: Related Work (mentioning)
confidence: 99%
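
The maximum matching procedure mentioned in the statement above can be illustrated with a minimal sketch of forward maximum matching, one common variant. This is not the method of any cited paper: the function name `forward_max_match`, the tiny `ADDRESS_DICTIONARY`, and the sample address are all assumptions made up for illustration.

```python
def forward_max_match(text, dictionary, max_len=None):
    """Greedy left-to-right segmentation: at each position, take the
    longest dictionary entry; fall back to a single character."""
    if max_len is None:
        max_len = max((len(word) for word in dictionary), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the widest window first and shrink until a dictionary hit.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical address-element dictionary (not taken from any cited work).
ADDRESS_DICTIONARY = {"广东省", "深圳市", "南山区", "深南大道"}

segments = forward_max_match("广东省深圳市南山区深南大道9028号", ADDRESS_DICTIONARY)
print(segments)
```

Characters not covered by the dictionary (here the house number) fall out as single-character tokens; a real rule-based system would presumably handle them with further rules from its address model.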
“…We illustrate this compound framework in Fig. 1: Given a set of addresses, we first generate a number of candidate address pairs for matching based on some simple heuristic rules as introduced in [19]. For instance, only addresses in the same city and district (if any), sharing at least one word (after removing stop words) in the left part of their address strings need to be compared.…”
Section: A Compound Framework For Address Matching (mentioning)
confidence: 99%
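
The candidate-pair heuristic quoted above can be sketched roughly as follows. Several details are assumptions, since the statement does not specify them: addresses are taken to be pre-tokenized records, the "left part" is taken to be the first `LEFT_PART_SIZE` tokens, the district is compared only when both records have one, and `STOP_WORDS` is a hypothetical list.

```python
from itertools import combinations

STOP_WORDS = {"road", "street", "no"}  # hypothetical stop-word list
LEFT_PART_SIZE = 3  # assumption: "left part" means the first three tokens


def left_words(tokens):
    """Non-stop-word tokens from the left part of an address."""
    return {t for t in tokens[:LEFT_PART_SIZE] if t not in STOP_WORDS}


def candidate_pairs(addresses):
    """Return index pairs worth comparing under the blocking heuristic."""
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(addresses), 2):
        if a["city"] != b["city"]:
            continue
        # Assumption: districts are compared only when both are present.
        if a.get("district") and b.get("district") and a["district"] != b["district"]:
            continue
        if left_words(a["tokens"]) & left_words(b["tokens"]):
            pairs.append((i, j))
    return pairs


# Made-up records for illustration only.
addresses = [
    {"city": "Shenzhen", "district": "Nanshan", "tokens": ["shennan", "road", "9028"]},
    {"city": "Shenzhen", "district": "Nanshan", "tokens": ["shennan", "avenue", "9028"]},
    {"city": "Shenzhen", "district": "Futian", "tokens": ["shennan", "road", "1001"]},
    {"city": "Shenzhen", "district": None, "tokens": ["shennan", "road", "9028"]},
    {"city": "Guangzhou", "district": None, "tokens": ["shennan", "road", "9028"]},
]

pairs = candidate_pairs(addresses)
print(pairs)
```

Filtering this way avoids comparing every address with every other one; only the surviving pairs would be passed to a more expensive matching step.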