Abstract: In the big data era, massive amounts of spatially related data are continuously generated and collected from various sources, and accurate geographic information is in urgent demand. Retrieving the desired geographic information accurately has therefore become a prominent, high-priority problem. The key technologies in geographic information retrieval are modeling document footprints and ranking documents by evaluating their spatial similarity. Traditional spatial similarity evaluation methods mainly rely on an MBR (Minimum Bounding Rectangle) footprint model. However, because the MBR is a coarse simplification, the results of these methods tend to be isotropic and space-redundant. In this paper, we present a new model that constructs footprints as point sets. The point-set-based footprint matches the nature of place names in web pages, so it describes the spatial extent of a document in a redundancy-free, consistent, accurate, and anisotropic way, and it can handle multi-scale geographic information. We also present a corresponding spatial ranking method built on the point-set-based model. Its similarity evaluation algorithm first measures multiple distances to capture spatial proximity across different scales, and then incorporates the frequency of place names to improve accuracy and precision. Experimental results show that the proposed method achieves higher accuracy than the traditional methods under different search scenarios.
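To make the idea of a point-set footprint concrete, the following is a minimal illustrative sketch, not the algorithm evaluated in this paper. It assumes a footprint is stored as a mapping from place-name coordinates to the number of times the place name occurs in the document, and it scores a document by a frequency-weighted nearest-point proximity with exponential distance decay. The function names `haversine_km` and `footprint_similarity`, the decay parameter `scale_km`, and the scoring formula itself are all hypothetical choices made for illustration; the multi-distance, multi-scale combination described above is not reproduced here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to guard against rounding slightly above 1 for antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def footprint_similarity(query_points, doc_footprint, scale_km=100.0):
    """Score a document footprint against a query footprint, in [0, 1].

    query_points:  list of (lat, lon) tuples for the query.
    doc_footprint: dict mapping (lat, lon) -> place-name frequency.
    scale_km:      assumed decay length; larger values tolerate larger distances.

    Hypothetical scoring: each document point contributes
    exp(-d / scale_km), where d is its distance to the nearest query point,
    weighted by its share of all place-name occurrences in the document.
    """
    total = sum(doc_footprint.values())
    if not query_points or total == 0:
        return 0.0
    score = 0.0
    for point, freq in doc_footprint.items():
        nearest = min(haversine_km(point, q) for q in query_points)
        score += (freq / total) * math.exp(-nearest / scale_km)
    return score


# Example: a query about Beijing and two candidate documents.
beijing = (39.9042, 116.4074)
tianjin = (39.3434, 117.3616)
shanghai = (31.2304, 121.4737)

doc_near = {beijing: 3, tianjin: 1}  # mentions Beijing three times, Tianjin once
doc_far = {shanghai: 4}              # mentions only Shanghai

ranked = sorted(
    [("doc_near", doc_near), ("doc_far", doc_far)],
    key=lambda item: footprint_similarity([beijing], item[1]),
    reverse=True,
)
print([name for name, _ in ranked])
```

Unlike an MBR, this representation adds no empty area between the mentioned places, and frequently mentioned places carry more weight, which is the intuition behind the point-set model.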