Modeling and recognizing landmarks at world-scale is a useful yet challenging task. There exists no readily available list of worldwide landmarks. Obtaining reliable visual models for each landmark can also pose problems, and efficiency is another challenge for such a large scale system. This paper leverages the vast amount of multimedia data on the web, the availability of an Internet image search engine, and advances in object recognition and clustering techniques, to address these issues. First, a comprehensive list of landmarks is mined from two sources: (1) ∼20 million GPS-tagged photos and (2) online tour guide web pages. Candidate images for each landmark are then obtained from photo sharing websites or by querying an image search engine. Second, landmark visual models are built by pruning candidate images using efficient image matching and unsupervised clustering techniques. Finally, the landmarks and their visual models are validated by checking authorship of their member images. The resulting landmark recognition engine incorporates 5312 landmarks from 1259 cities in 144 countries. The experiments demonstrate that the engine can deliver satisfactory recognition performance with high efficiency.
Recently, the phenomenal advent of photo-sharing services, such as Flickr and Panoramio, have led to volumous community-contributed photos with text tags, timestamps, and geographic references on the Internet. The photos, together with their time- and geo-references, become the digital footprints of photo takers and implicitly document their spatiotemporal movements. This study aims to leverage the wealth of these enriched online photos to analyze people’s travel patterns at the local level of a tour destination. Specifically, we focus our analysis on two aspects: (1) tourist movement patterns in relation to the regions of attractions (RoA), and (2) topological characteristics of travel routes by different tourists. To do so, we first build a statistically reliable database of travel paths from a noisy pool of community-contributed geotagged photos on the Internet. We then investigate the tourist traffic flow among different RoAs by exploiting the Markov chain model. Finally, the topological characteristics of travel routes are analyzed by performing a sequence clustering on tour routes. Testings on four major cities demonstrate promising results of the proposed system.
This paper introduces a web image dataset created by NUS's Lab for Media Search. The dataset includes: (1) 269,648 images and the associated tags from Flickr, with a total of 5,018 unique tags; (2) six types of low-level features extracted from these images, including 64-D color histogram, 144-D color correlogram, 73-D edge direction histogram, 128-D wavelet texture, 225-D block-wise color moments extracted over 5×5 fixed grid partitions, and 500-D bag of words based on SIFT descriptions; and (3) ground-truth for 81 concepts that can be used for evaluation. Based on this dataset, we highlight characteristics of Web image collections and identify four research issues on web image annotation and retrieval. We also provide the baseline results for web image annotation by learning from the tags using the traditional k -NN algorithm. The benchmark results indicate that it is possible to learn effective models from sufficiently large image dataset to facilitate general image retrieval.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.