ABSTRACT: Image geolocalization has become an important research field over the last decade. The field has two main branches. The first is image geolocalization, which determines which country, region, or city an image belongs to. The second is refining that localization for uses that require more accuracy, such as augmented reality and three-dimensional environment reconstruction from images. In this paper we present a processing chain that gathers geographic data from several sources in order to deliver a better geolocalization of an image than its GPS position, together with precise camera pose parameters. To do so, we use multiple types of data. Some of this information is visible in the image and is extracted using image processing; other types of data can be extracted from image file headers or from related information on online image sharing platforms. These information elements are not expressive enough if they remain disconnected. We show that grouping them helps to find the best geolocalization of the image.
Cities change constantly, and city managers aim to keep an up-to-date digital model of the city for governance. Many images are uploaded daily to image sharing platforms (such as Flickr and Twitter). These images carry only a rough localization and no orientation information. They could nevertheless populate an active collaborative database of street images usable to maintain a 3D city model, provided their localization and orientation are known. Based on these images, we propose the Data Gathering system for image Pose Estimation (DGPE), which helps to find the pose (position and orientation) of the camera used to shoot them with better accuracy than the GPS localization alone that may be embedded in the image header. DGPE uses both visual and semantic information available for a single image, processed by a fully automatic chain composed of three main layers: a data retrieval and preprocessing layer, a feature extraction layer, and a decision-making layer. In this article, we present the system in detail and compare its detection results with a state-of-the-art method. Finally, we show the localization, and often orientation, results obtained by combining semantic and visual information processing on 47 images. Using only the image content and associated metadata, our multilayer system finds a better localization and orientation of the original photo in 26% of our test cases. Adding semantic information found on social media, such as comments and hashtags, more than doubles the success rate, to 59%: it reduces the search area and thus makes the visual search more accurate.
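As a small illustration of the GPS localization that may be embedded in an image header, the sketch below converts an EXIF-style degrees/minutes/seconds triple into signed decimal degrees. It is not part of DGPE as described here: the function name `dms_to_decimal` and the sample coordinates are assumptions made for illustration, and reading the tags from an actual file is left out.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) rationals to signed decimal degrees.

    `dms` is a triple of (numerator, denominator) pairs, the way EXIF stores
    GPSLatitude and GPSLongitude; `ref` is the hemisphere letter
    ("N", "S", "E" or "W").
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # South latitudes and West longitudes are negative by convention.
    if ref in ("S", "W"):
        value = -value
    return float(value)


# Hypothetical header values (assumed for illustration, not from the paper).
lat = dms_to_decimal(((48, 1), (51, 1), (30, 1)), "N")
lon = dms_to_decimal(((2, 1), (17, 1), (40, 1)), "E")
print(lat, lon)
```

A position obtained this way is only the rough starting point the abstract refers to; it carries no orientation information.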