SpatialHadoop is an extended MapReduce framework that supports global indexing, spatially partitioning the data across machines and providing orders-of-magnitude speedup compared to traditional Hadoop. In this paper, we describe seven alternative partitioning techniques and experimentally study their effect on the quality of the generated index and the performance of range and spatial join queries. We found that using a 1% sample is enough to produce high-quality partitions. We also found that the total area of partitions is a reasonable measure of index quality when running spatial joins. This study will assist researchers in choosing a good spatial partitioning technique in distributed environments.

1. INDEXING IN SPATIALHADOOP

SpatialHadoop [2, 3] provides a generic indexing algorithm which was used to implement grid, R-tree, and R+-tree based partitioning. This paper extends our previous study by introducing four new partitioning techniques, Z-curve, Hilbert curve, Quad tree, and K-d tree, and experimentally evaluating all seven techniques. The partitioning phase of the indexing algorithm runs in three steps, where the first step is fixed and the last two steps are customized for each partitioning technique. The first step computes the number of desired partitions n based on the file size and the HDFS block capacity, which are both fixed for all partitioning techniques. The second step reads a random sample, with a sampling ratio ρ, from the input file and uses this sample to partition the space into n cells such that the number of sample points in each cell is at most ⌊k/n⌋, where k is the sample size. The third step actually partitions the file by assigning each record to one or more cells. Boundary objects are handled using either the distribution or the replication method. The distribution method assigns an object to exactly one overlapping cell, and that cell has to be expanded to enclose all contained records. The replication method avoids expanding cells by replicating each record to all overlapping cells, but the query processor then has to employ a duplicate-avoidance technique to account for replicated records.
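To make the three steps concrete, the following is a minimal Python sketch of the partitioning phase using the K-d tree variant of step 2. The function names, the rectangle representation (x1, y1, x2, y2), and the median-split details are illustrative assumptions, not SpatialHadoop's actual API.

import math

def compute_num_partitions(file_size_bytes, hdfs_block_bytes):
    """Step 1: the number of desired partitions n, fixed for all techniques."""
    return math.ceil(file_size_bytes / hdfs_block_bytes)

def kd_partition(sample, n, bounds):
    """Step 2 (K-d tree variant): recursively split the space at the median
    so each cell holds at most floor(k/n) sample points, where k = len(sample)."""
    capacity = max(1, len(sample) // n)
    cells = []

    def split(points, box, axis):
        if len(points) <= capacity:
            cells.append(box)
            return
        points.sort(key=lambda p: p[axis])
        mid = len(points) // 2
        m = points[mid][axis]
        x1, y1, x2, y2 = box
        lo, hi = ((x1, y1, m, y2), (m, y1, x2, y2)) if axis == 0 \
                 else ((x1, y1, x2, m), (x1, m, x2, y2))
        split(points[:mid], lo, 1 - axis)  # alternate the split axis per level
        split(points[mid:], hi, 1 - axis)

    split(list(sample), bounds, 0)
    return cells

def intersects(cell, mbr):
    return not (mbr[2] < cell[0] or cell[2] < mbr[0] or
                mbr[3] < cell[1] or cell[3] < mbr[1])

def assign_replication(mbr, cells):
    """Step 3, replication method: copy the record to every overlapping cell;
    the query processor must apply duplicate avoidance later."""
    return [i for i, c in enumerate(cells) if intersects(c, mbr)]

def assign_distribution(mbr, cells):
    """Step 3, distribution method: assign the record to exactly one
    overlapping cell (here: the first) and expand that cell to enclose it."""
    i = next(i for i, c in enumerate(cells) if intersects(c, mbr))
    c = cells[i]
    cells[i] = (min(c[0], mbr[0]), min(c[1], mbr[1]),
                max(c[2], mbr[2]), max(c[3], mbr[3]))
    return i

Note how the sketch makes the trade-off visible: assign_distribution keeps one copy per record but lets cells grow and overlap, while assign_replication keeps cells fixed at the cost of duplicates that queries must filter out.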
COVID-19 caused the largest economic recession in history by placing more than one third of the world's population under lockdown. The prolonged restrictions on economic and business activities caused severe economic turmoil that significantly affected financial markets. To ease the growing pressure on the economy, scientists proposed intermittent lockdowns, commonly known as "smart lockdowns". Under a smart lockdown, areas that contain infected clusters of the population, namely hotspots, are placed under lockdown, while economic activities are allowed to operate in uninfected areas. In this study, we propose a novel deep learning prediction framework for the accurate prediction of hotspots. We exploit the benefits of two deep learning models, i.e., the Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM), and propose a hybrid framework that can extract multi-time-scale features from the convolutional layers of the CNN. The multi-time-scale features are then concatenated and provided as input to a two-layer LSTM model. The LSTM model identifies short-, medium-, and long-term dependencies by learning the representation of the time-series data. We perform a series of experiments and compare the proposed framework with other state-of-the-art statistical and machine learning based prediction models. The experimental results demonstrate that the proposed framework outperforms existing methods by a clear margin.
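As a rough illustration of this hybrid architecture, the following is a minimal sketch using the Keras API. The layer sizes, the kernel widths (3, 5, and 7, to capture multiple time scales), and the 30-day input window are assumptions chosen for illustration, not the paper's actual configuration.

import tensorflow as tf
from tensorflow.keras import layers

def build_cnn_lstm(window=30, features=1):
    inputs = tf.keras.Input(shape=(window, features))
    # Convolutional branches with different kernel widths extract
    # multi-time-scale features from the same input series.
    branches = []
    for kernel in (3, 5, 7):
        x = layers.Conv1D(32, kernel, padding="same", activation="relu")(inputs)
        branches.append(x)
    # Concatenate the multi-scale feature maps along the channel axis.
    merged = layers.Concatenate()(branches)
    # Two stacked LSTM layers learn short-, medium-, and long-term dependencies.
    x = layers.LSTM(64, return_sequences=True)(merged)
    x = layers.LSTM(32)(x)
    outputs = layers.Dense(1)(x)  # predicted case count for the next time step
    return tf.keras.Model(inputs, outputs)

model = build_cnn_lstm()
model.compile(optimizer="adam", loss="mse")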