We survey popular data sets used in the computer vision literature and point out their limitations for mobile visual search applications. To overcome many of these limitations, we propose the Stanford Mobile Visual Search data set. The data set contains camera-phone images of products, CDs, books, outdoor landmarks, business cards, text documents, museum paintings and video clips. It has several key characteristics lacking in existing data sets: rigid objects, widely varying lighting conditions, perspective distortion, foreground and background clutter, realistic ground-truth reference data, and query data collected from heterogeneous low- and high-end camera phones. We hope that the data set will help push research forward in the field of mobile visual search.
Mobile phones are an attractive platform for landmark-based pedestrian navigation systems. To be practical, such a system must be able to automatically generate lightweight directions that can be displayed on these mobile devices. We present a system that leverages an online collection of geotagged photographs to automatically generate navigational instructions, presented to the user as a sequence of landmark images augmented with directional instructions. Both the landmark selection and the image augmentation are performed automatically. We present a user study indicating that these generated directions are beneficial to users, and we suggest areas for future improvement.