Most recommendation datasets for tourism are restricted to one world region and rely on explicit data such as checkins. However, in reality, tourists visit various places worldwide and document their trips primarily through photos. These images contain a wealth of raw information that can be used to capture users' preferences and recommend personalized content. Visual content was already used in past works, but no large-scale publicly-available dataset that gives access to users' personal images exists for recommender systems. As such a resource would open-up possibilities for new image-based recommendation algorithms, we introduce Vis2Rec, a new dataset based on visit data extracted from users' Flickr photographic streams, which includes over 7 million photos, 36k recognizable points of interest, and 14k user profiles. Google Landmarks v2 is used as an auxiliary dataset to identify points of interest in users' photos, using a state-of-the-art image-matching deep architecture. Image-based user profiles are then constituted by aggregating the points of interest detected for each user. In addition, ground truth visits were determined for the test subset in order to enable accurate evaluation. Finally, we benchmark Vis2Rec using various existing recommender systems, and discuss the possibilities opened up by the availability of user images, as well as the societal issues that come with them. Following good practice in dataset sharing, Vis2Rec is created using only freely distributable content, and additional anonymization is performed to ensure the privacy of users. The raw dataset and the preprocessed user profiles will be publicly available at https://github.com/MSoumm/Vis2Rec.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.