This paper introduces a new algorithm for spatial join operations. The algorithm, NRQB (No Replication with Quadtrees and Buckets Spatial Merge Join), enhances the original PBSM by partitioning the space according to the spatial distribution of the objects. In addition, a hash file is created for each input data set to improve both the storage of and access to the minimum bounding rectangles (MBRs) of the set's elements. The paper also presents a performance evaluation of the proposed algorithm, based on a series of test cases covering different spatial join scenarios. In each test case, the response time of NRQB is compared with that of several well-known algorithms. The test cases were run with both synthetic and real data sets. The results show that the new algorithm is best suited to smaller buffer sizes, which are typical of mobile devices and database systems for desktop computers.
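To make the role of the MBRs concrete, the following is a minimal sketch of what a spatial join filtering step computes. It is not the NRQB algorithm: it uses a plain nested loop with no partitioning, quadtree, or hash file. The `MBR` type and the `intersects` and `filter_step` functions are hypothetical names chosen for illustration.

```python
from typing import NamedTuple


class MBR(NamedTuple):
    """Axis-aligned minimum bounding rectangle (hypothetical helper type)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a: MBR, b: MBR) -> bool:
    """Return True when the two rectangles share at least one point."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def filter_step(r: list[MBR], s: list[MBR]) -> list[tuple[int, int]]:
    """Naive filtering step: index pairs (i, j) whose MBRs overlap.

    The result is only a candidate set; a refinement step on the exact
    geometries would still be needed to confirm each pair.
    """
    return [(i, j)
            for i, a in enumerate(r)
            for j, b in enumerate(s)
            if intersects(a, b)]
```

A partition-based algorithm would aim to produce the same candidate pairs while comparing only rectangles that fall in the same region of space, rather than every pair.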
This paper presents a cost-based query optimizer module that chooses the best filtering-step algorithm for a given spatial join operation. A set of expressions to predict the number of I/O operations and the response time of each algorithm is first presented and then refined for a given hardware configuration. The query optimizer chooses the algorithm with the smallest estimated response time. To evaluate the query optimizer, we carried out a set of tests with synthetic and real data sets in a significant number of different scenarios. The query optimizer correctly chooses the fastest algorithm in almost 90% of submitted operations, with minimal overhead.
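The selection rule described above can be sketched as follows. This is a minimal illustration, assuming each candidate algorithm is represented by a function that returns an estimated response time. The names `CostEstimator` and `choose_algorithm` are hypothetical, and the paper's actual cost expressions are not reproduced here.

```python
from typing import Callable, Dict, Tuple

# Hypothetical signature: an estimator maps a description of the join
# (data set sizes, buffer size, and so on) to a predicted response time.
CostEstimator = Callable[[dict], float]


def choose_algorithm(estimators: Dict[str, CostEstimator],
                     join_params: dict) -> Tuple[str, float]:
    """Return the name and estimate of the algorithm predicted to be fastest."""
    if not estimators:
        raise ValueError("at least one candidate algorithm is required")
    # Evaluate every candidate's cost expression once.
    estimates = {name: estimate(join_params)
                 for name, estimate in estimators.items()}
    # Pick the candidate with the smallest estimated response time.
    best = min(estimates, key=estimates.get)
    return best, estimates[best]
```

Because the optimizer only evaluates closed-form estimates, its overhead is one function call per candidate algorithm.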
The spatial join is one of the most important and most expensive operations in Geographic Database Management Systems (GDBMS). This paper presents a set of rules to optimize the performance of the filtering step of spatial join operations. First, a set of expressions to predict the number of I/O operations and CPU performance is presented. The rules are based on these performance-prediction expressions and on tests performed with synthetic and real data sets. In some cases, the optimized algorithm can execute the same operation 10 times faster than the original, non-optimized version.