Microblogs allow users to publish geo-tagged posts: short textual messages assigned to a geographic location. Users send posts from places they visit and discuss an idiosyncratic mixture of personal and general topics. Thus, it is reasonable to assume that the locations and the textual content of posts will be unique and will, to some extent, identify the posting user. This raises the question of whether there is a correlation between the locations of posts and their content. Are users who are similar from the geospatial perspective (i.e., who send messages from nearby locations) also similar from the textual perspective (i.e., who send messages with similar textual content)? Do posts with similar content have a spatial distribution similar to that of a random set of posts? We present a study that focuses on these questions. We provide statistical tests to examine the correlation between textual content and geospatial locations in tweets. We show that although there is some correlation between locations and textual content, they provide different similarity measures, and that combining the two properties to identify users by their posts outperforms methods that use only locations or only textual content.
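The idea of combining a geospatial and a textual similarity measure to identify a posting user can be illustrated with a minimal sketch. The abstract does not specify the measures or the combination rule used in the study, so everything here is an assumption made for illustration: posts are `(lat, lon, text)` tuples, spatial similarity is the Jaccard overlap of visited grid cells, textual similarity is cosine similarity over word counts, and the two are mixed with a hypothetical weight `alpha`.

```python
from collections import Counter
from math import floor, sqrt


def text_similarity(posts_a, posts_b):
    """Cosine similarity between the bag-of-words vectors of two post sets."""
    words_a = Counter(w for _, _, text in posts_a for w in text.lower().split())
    words_b = Counter(w for _, _, text in posts_b for w in text.lower().split())
    dot = sum(words_a[w] * words_b[w] for w in words_a.keys() & words_b.keys())
    norm = sqrt(sum(v * v for v in words_a.values())) * sqrt(
        sum(v * v for v in words_b.values())
    )
    return dot / norm if norm else 0.0


def spatial_similarity(posts_a, posts_b, cell_size=0.01):
    """Jaccard overlap of the grid cells (cell_size in degrees) each set touches."""
    cells_a = {(floor(lat / cell_size), floor(lon / cell_size)) for lat, lon, _ in posts_a}
    cells_b = {(floor(lat / cell_size), floor(lon / cell_size)) for lat, lon, _ in posts_b}
    union = cells_a | cells_b
    return len(cells_a & cells_b) / len(union) if union else 0.0


def combined_similarity(posts_a, posts_b, alpha=0.5):
    """Weighted mix of both measures; alpha is an illustrative assumption."""
    return alpha * spatial_similarity(posts_a, posts_b) + (1 - alpha) * text_similarity(
        posts_a, posts_b
    )


def identify(unknown_posts, known_users, alpha=0.5):
    """Return the known user whose posts are most similar to unknown_posts."""
    return max(
        known_users,
        key=lambda user: combined_similarity(unknown_posts, known_users[user], alpha),
    )
```

Setting `alpha` to 1 or 0 reduces the sketch to a location-only or text-only baseline, which is the kind of comparison the abstract describes.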
Geographic search, where the user provides keywords and receives relevant locations depicted on a map, is a popular web application. Typically, such search is based on static geographic data. However, the abundant geotagged posts in microblogs such as Twitter and in social networks like Instagram provide contemporary information that can be used to support geosocial search: geographic search based on user activities in social media. Such search can point out where people talk (or tweet) about different topics. For example, the search results may show where people refer to "jogging", to indicate popular jogging places. The difficulty in implementing such search is that there is no natural partition of the space into "documents" as in ordinary web search. Thus, it is not always clear how to present results or how to rank and filter them effectively. In this paper, we demonstrate a two-step process: first, quickly finding the relevant areas by using an arbitrary indexed partition of the space, and second, applying clustering to the discovered areas to present more accurate results. We introduce a system that utilizes geotagged posts in geographic search and illustrate how different ranking methods can be used, based on the proposed two-step search process. The system demonstrates the effectiveness and usefulness of the approach.
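The two-step process can be sketched roughly as follows. This is not the system described in the abstract; it is a minimal illustration under stated assumptions: the "arbitrary indexed partition" is a uniform grid, keyword matching is a case-insensitive substring test, clustering is a simple single-link grouping by distance in degrees, and clusters are ranked by size. The function names and the parameters `cell_size`, `min_hits`, and `radius` are hypothetical.

```python
from collections import defaultdict
from math import floor


def build_grid_index(posts, cell_size):
    """Index (lat, lon, text) posts by the grid cell that contains them."""
    index = defaultdict(list)
    for lat, lon, text in posts:
        cell = (floor(lat / cell_size), floor(lon / cell_size))
        index[cell].append((lat, lon, text))
    return index


def relevant_cells(index, keyword, min_hits):
    """Step 1: keep cells with at least min_hits posts that mention keyword."""
    keyword = keyword.lower()
    hits = {}
    for cell, cell_posts in index.items():
        matching = [p for p in cell_posts if keyword in p[2].lower()]
        if len(matching) >= min_hits:
            hits[cell] = matching
    return hits


def cluster_points(points, radius):
    """Step 2: single-link clustering; points within radius share a cluster."""
    clusters = []
    for p in points:
        merged, rest = [p], []
        for cluster in clusters:
            near = any(
                (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= radius ** 2
                for q in cluster
            )
            if near:
                merged.extend(cluster)  # p links this cluster to the merged one
            else:
                rest.append(cluster)
        clusters = rest + [merged]
    return clusters


def geosocial_search(posts, keyword, cell_size=0.01, min_hits=2, radius=0.002):
    """Run both steps and rank the resulting clusters by size (an assumption)."""
    index = build_grid_index(posts, cell_size)
    hits = relevant_cells(index, keyword, min_hits)
    points = [p for cell_posts in hits.values() for p in cell_posts]
    return sorted(cluster_points(points, radius), key=len, reverse=True)
```

The grid step is cheap and discards most of the space early, so the costlier clustering runs only on posts from cells that already look relevant. Other ranking methods could replace the final sort.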