Abstract: This paper proposes LARS, a location-aware recommender system that uses location-based ratings to produce recommendations. Traditional recommender systems do not consider the spatial properties of users or items; LARS, on the other hand, supports a taxonomy of three novel classes of location-based ratings, namely, spatial ratings for non-spatial items, non-spatial ratings for spatial items, and spatial ratings for spatial items. LARS exploits user rating locations through user partitioning, a technique that influences recommendations with ratings spatially close to querying users in a manner that maximizes system scalability without sacrificing recommendation quality. LARS exploits item locations using travel penalty, a technique that favors recommendation candidates closer in travel distance to querying users in a way that avoids exhaustive access to all spatial items. LARS can apply these techniques separately or together, depending on the type of location-based rating available. Experimental evidence using large-scale real-world data from both the Foursquare location-based social network and the MovieLens movie recommendation system reveals that LARS is efficient, scalable, and capable of producing recommendations twice as accurate as those of existing recommendation approaches.
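The travel penalty idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give a scoring formula, so everything here is an assumption: the score is taken to be a predicted rating minus a travel penalty, the penalty is assumed to be normalized to the rating scale, candidates are assumed to arrive in ascending order of penalty (for example from an incremental nearest-neighbor search), and the names `top_k_with_travel_penalty` and `predict` are hypothetical.

```python
import heapq


def top_k_with_travel_penalty(candidates, predict, k, max_rating):
    """Return the k best (score, item) pairs, highest score first.

    candidates: iterable of (item, penalty), sorted by ascending penalty
                (assumption: produced by an incremental nearest-neighbor scan).
    predict:    hypothetical function mapping an item to its predicted rating.
    max_rating: upper bound on any predicted rating.
    """
    best = []  # min-heap of (score, item); best[0] is the weakest kept result
    for item, penalty in candidates:
        # Any item at this distance or farther scores at most
        # max_rating - penalty. If the weakest kept score already reaches
        # that bound, no remaining item can enter the top k, so stop early.
        if len(best) == k and best[0][0] >= max_rating - penalty:
            break
        score = predict(item) - penalty
        if len(best) < k:
            heapq.heappush(best, (score, item))
        elif score > best[0][0]:
            heapq.heapreplace(best, (score, item))
    return sorted(best, reverse=True)
```

The early exit is what avoids exhaustive access in this sketch: once the bound check succeeds, the remaining (farther) items are never scored.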
SpatialHadoop is an extended MapReduce framework that supports global indexing, which spatially partitions the data across machines and provides orders-of-magnitude speedup compared to traditional Hadoop. In this paper, we describe seven alternative partitioning techniques and experimentally study their effect on the quality of the generated index and on the performance of range and spatial join queries. We found that a 1% sample is enough to produce high-quality partitions. We also found that the total area of partitions is a reasonable measure of index quality when running spatial joins. This study will assist researchers in choosing a good spatial partitioning technique in distributed environments.
1. INDEXING IN SPATIALHADOOP
SpatialHadoop [2, 3] provides a generic indexing algorithm, which was used to implement grid, R-tree, and R+-tree based partitioning. This paper extends our previous study by introducing four new partitioning techniques (Z-curve, Hilbert curve, Quad tree, and K-d tree) and experimentally evaluating all seven techniques.
The partitioning phase of the indexing algorithm runs in three steps, where the first step is fixed and the last two steps are customized for each partitioning technique. The first step computes the number of desired partitions n based on the file size and the HDFS block capacity, both of which are fixed for all partitioning techniques. The second step reads a random sample, with a sampling ratio ρ, from the input file and uses this sample to partition the space into n cells such that the number of sample points in each cell is at most ⌊k/n⌋, where k is the sample size. The third step actually partitions the file by assigning each record to one or more cells.
Boundary objects are handled using either the distribution or the replication method. The distribution method assigns an object to exactly one overlapping cell, and the cell has to be expanded to enclose all of its records. The replication method avoids expanding cells by replicating each record to all overlapping cells, but the query processor then has to employ a duplicate avoidance technique to account for the replicated records.
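The three partitioning steps and the two boundary-handling methods can be sketched as follows. This is an illustrative simplification, not SpatialHadoop's code and not one of the seven techniques studied: it partitions only along the x-axis into vertical strips, and all function names are hypothetical. Choosing the strip that contains the left edge for the distribution method is also an assumption, and cell expansion is not modeled.

```python
import math
import random
from bisect import bisect_right


def num_partitions(file_size, block_capacity):
    """Step 1: desired number of partitions n from file size and block capacity."""
    return max(1, math.ceil(file_size / block_capacity))


def draw_sample(xs, rho, seed=0):
    """Step 2a: keep each record independently with probability rho."""
    rng = random.Random(seed)
    return [x for x in xs if rng.random() < rho]


def strip_boundaries(sample_xs, n):
    """Step 2b: cut the x-axis into n strips with near-equal sample counts.

    Returns the n - 1 interior boundaries (an empty list for an empty sample).
    """
    xs = sorted(sample_xs)
    k = len(xs)
    if k == 0:
        return []
    return [xs[(i * k) // n] for i in range(1, n)]


def assign(x_min, x_max, boundaries, replicate):
    """Step 3: map a record's x-extent [x_min, x_max] to strip indices.

    replicate=True  -> replication: every overlapping strip.
    replicate=False -> distribution: exactly one strip (here, the one
                       containing x_min; an assumed tie-breaking rule).
    """
    first = bisect_right(boundaries, x_min)
    last = bisect_right(boundaries, x_max)
    return list(range(first, last + 1)) if replicate else [first]
```

For example, with boundaries `[10, 20]` a record spanning x from 5 to 25 is replicated to strips 0, 1, and 2, whereas the distribution method places it in strip 0 only. Under replication, a query would still need to filter duplicates, which this sketch leaves out.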
This paper proposes LARS*, a location-aware recommender system that uses location-based ratings to produce recommendations. Traditional recommender systems do not consider the spatial properties of users or items; LARS*, on the other hand, supports a taxonomy of three novel classes of location-based ratings, namely, spatial ratings for non-spatial items, non-spatial ratings for spatial items, and spatial ratings for spatial items. LARS* exploits user rating locations through user partitioning, a technique that influences recommendations with ratings spatially close to querying users in a manner that maximizes system scalability without sacrificing recommendation quality. LARS* exploits item locations using travel penalty, a technique that favors recommendation candidates closer in travel distance to querying users in a way that avoids exhaustive access to all spatial items. LARS* can apply these techniques separately or together, depending on the type of location-based rating available. Experimental evidence using large-scale real-world data from both the Foursquare location-based social network and the MovieLens movie recommendation system reveals that LARS* is efficient, scalable, and capable of producing recommendations twice as accurate as those of existing recommendation approaches.
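User partitioning, as summarized above, restricts recommendations to ratings that are spatially close to the querying user. The sketch below is a loose illustration under strong assumptions, not the system's actual data structure or model: it uses a single fixed-size latitude/longitude grid, and a per-cell mean rating stands in for whatever recommendation model would be trained per partition. All names (`cell_of`, `build_cells`, `recommend`) are hypothetical.

```python
import math
from collections import defaultdict


def cell_of(lat, lon, cell_deg):
    """Grid cell (row, col) containing a location, for square cells of cell_deg degrees."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def build_cells(ratings, cell_deg):
    """Group ratings by the grid cell in which they were made.

    ratings: iterable of (user, item, rating, lat, lon).
    Returns {cell: {item: [ratings...]}}.
    """
    cells = defaultdict(lambda: defaultdict(list))
    for _user, item, rating, lat, lon in ratings:
        cells[cell_of(lat, lon, cell_deg)][item].append(rating)
    return cells


def recommend(cells, lat, lon, cell_deg, k):
    """Top-k items using only ratings from the querying user's own cell.

    The mean rating is a stand-in (assumption) for a real per-cell model.
    Ties are broken by item name for determinism.
    """
    local = cells.get(cell_of(lat, lon, cell_deg), {})
    scored = [(sum(rs) / len(rs), item) for item, rs in local.items()]
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [item for _, item in scored[:k]]
```

Because each query touches only one cell's ratings, the work per query is bounded by the cell's contents rather than by the full rating set, which is the scalability intuition behind partitioning by user location.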
Abstract: Remote sensing data collected by satellites are now made publicly available by several space agencies. This data is very useful for scientists pursuing research in several applications, including climate change, desertification, and land use change. The benefit of this data comes from its richness, as it provides an archived history of more than 15 years of satellite observations of natural phenomena such as temperature and vegetation. Unfortunately, the use of such data is very limited due to the huge size of the archives (> 500 TB) and the limited capabilities of traditional applications. This paper introduces SHAHED, a MapReduce-based system for querying, visualizing, and mining large-scale satellite data. SHAHED considers both the spatial and temporal aspects of the data to provide efficient query processing at large scale. The core of SHAHED is composed of four main components. The uncertainty component recovers missing data in the input, which results from cloud coverage and satellite misalignment. The indexing component provides a novel multi-resolution quad-tree-based spatio-temporal index structure, which indexes satellite data efficiently with minimal space overhead. The querying component answers selection and aggregate queries in real time using the constructed index. Finally, the visualization component uses MapReduce programs to generate heat map images and videos for user queries. A set of experiments running on a live system deployed on a cluster of machines shows the efficiency of the proposed design. All the features supported by SHAHED are made accessible through an easy-to-use web interface that hides the complexity of the system and provides a nice user experience.
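To illustrate why a multi-resolution quad-tree helps with aggregate queries, here is a minimal sketch of an aggregate quad-tree over a square grid. It is a generic textbook-style structure, not SHAHED's index: it covers only the spatial dimension, assumes a grid whose side is a power of two, and the class name `AggNode` is hypothetical.

```python
class AggNode:
    """Quad-tree node storing the sum and count of the grid cells it covers."""

    def __init__(self, grid, x, y, size):
        # The node covers cells [x, x + size) by [y, y + size); grid[row][col].
        self.x, self.y, self.size = x, y, size
        if size == 1:
            self.children = []
            self.total = grid[y][x]
            self.count = 1
        else:
            h = size // 2
            self.children = [
                AggNode(grid, x + dx, y + dy, h) for dy in (0, h) for dx in (0, h)
            ]
            self.total = sum(c.total for c in self.children)
            self.count = sum(c.count for c in self.children)

    def query(self, x1, y1, x2, y2):
        """Return (sum, count) over the half-open range [x1, x2) by [y1, y2)."""
        # Disjoint from the query range: contributes nothing.
        if (x2 <= self.x or x1 >= self.x + self.size
                or y2 <= self.y or y1 >= self.y + self.size):
            return 0, 0
        # Fully covered: answer from the precomputed aggregate, no descent.
        if (x1 <= self.x and y1 <= self.y
                and x2 >= self.x + self.size and y2 >= self.y + self.size):
            return self.total, self.count
        # Partial overlap: combine the children's answers.
        total, count = 0, 0
        for child in self.children:
            t, c = child.query(x1, y1, x2, y2)
            total += t
            count += c
        return total, count
```

A query that covers a whole quadrant is answered from that quadrant's stored aggregate without visiting individual cells; an average over a range is the returned sum divided by the returned count.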