2021
DOI: 10.3390/ijgi10110779

An End-to-End Point of Interest (POI) Conflation Framework

Abstract: Point of interest (POI) data serves as a valuable source of semantic information for places of interest and has many geospatial applications in real estate, transportation, and urban planning. With the availability of different data sources, POI conflation serves as a valuable technique for enriching data quality and coverage by merging the POI data from multiple sources. This study proposes a novel end-to-end POI conflation framework consisting of six steps, starting with data procurement, schema standardisat…

Cited by 31 publications (31 citation statements)
References 41 publications
“…These preprocessing steps generally help remove string differences due to syntactic issues, such as different uses of upper and lower cases or different word orders, and help the similarity measurement focus on the remaining and more meaningful parts of the POI names. Novack et al (2018) and Low et al (2021) used token sort ratio which tokenizes POI names into individual words, sorts them in alphabetical order to form new strings, and then calculates the Levenshtein distance between the new strings. Similar methods, such as token set ratio (based on common tokens in two POI names while ignoring token orders), were also used in Piech et al (2020).…”
Section: Similarity Measures Over POI Names (mentioning)
confidence: 99%
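The token sort ratio described in the statement above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by the cited authors: the function names are hypothetical, lowercasing stands in for the preprocessing the statement mentions, and the normalisation to a 0-1 score (one minus the edit distance divided by the longer string's length) is an assumption, since the statement only says that a Levenshtein distance is computed on the sorted strings.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via dynamic programming."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


def sort_tokens(name: str) -> str:
    """Lowercase a POI name, split it on whitespace, and rejoin the sorted words."""
    return " ".join(sorted(name.lower().split()))


def token_sort_similarity(name_a: str, name_b: str) -> float:
    """Similarity in [0, 1] between two names after token sorting.

    Assumption: the score is normalised by the longer sorted string;
    the source does not specify a normalisation.
    """
    a, b = sort_tokens(name_a), sort_tokens(name_b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty names are treated as identical
    return 1.0 - levenshtein(a, b) / longest
```

With this sketch, `token_sort_similarity("Starbucks Coffee", "coffee starbucks")` returns 1.0, because both names reduce to the same sorted string and differences in case and word order no longer affect the score.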
“…From time to time, researchers may want to merge two or more POI datasets in order to obtain a better representation of the places in their study areas. This process is generally called POI conflation (Low et al, 2021;McKenzie et al, 2014). A main reason for POI conflation is that different POI datasets may have different attribute focuses, place coverages, and data quality.…”
Section: Introduction (mentioning)
confidence: 99%
“…Low et al [14] proposed a complete POIs conflation framework, from data catering to data verification. They also used POIs' attributes information to match POIs.…”
Section: Related Work (mentioning)
confidence: 99%
“…This is performed under the assumption that types' taxonomies from different data sources are consistent and not noisy. In addition, although Low et al [14] claimed large-scale application of their conflation framework, the maximum scale of the used dataset does not exceed 12,000 POIs. The scalability limitation is mainly due to the manual intervention of human experts in the verification step.…”
Section: Related Work (mentioning)
confidence: 99%