The focus of this paper is on investigating efficient hash join algorithms for modern multi-core processors in main memory environments. This paper dissects each internal phase of a typical hash join algorithm and considers different alternatives for implementing each phase, producing a family of hash join algorithms. Then, we implement these main memory algorithms on two radically different modern multiprocessor systems, and carefully examine the factors that impact the performance of each method.Our analysis reveals some interesting results -a very simple hash join algorithm is very competitive to the other more complex methods. This simple join algorithm builds a shared hash table and does not partition the input relations. Its simplicity implies that it requires fewer parameter settings, thereby making it far easier for query optimizers and execution engines to use it in practice. Furthermore, the performance of this simple algorithm improves dramatically as the skew in the input data increases, and it quickly starts to outperform all other algorithms. Based on our results, we propose that database implementers consider adding this simple join algorithm to their repertoire of main memory join algorithms, or adapt their methods to mimic the strategy employed by this algorithm, especially when joining inputs with skewed data distributions.
Single-version locking schedulerProving the single-version locking scheme correct is trivial, as the scheduler is a 2PL scheduler. Multi-version pessimistic (locking) schedulerThe multi-version pessimistic (locking) scheme is in fact a MV2PL scheduler. Holding a certify (commit) lock on a data item in MV2PL is exactly like having the NoMoreReadLocks bit set in the latest version of the data item in our implementation (see Section 4.2.1). Section 5.5.2 of [WV02] describes MV2PL in detail and proves it only admits 1SR multi-version histories. Multi-version optimistic schedulerLet us now prove that the multi-version optimistic scheduler only admits 1SR multi-version histories. We use the notation and theorems from Section 5.2 of [BHG87]. The multi-version optimistic scheduler behaves like a MVTO scheduler, with the changes described below.Let transaction Tx be a committed transaction with a Begin timestamp of TxBegin and an End timestamp of TxEnd.Property 1: Timestamps are assigned in a monotonically increasing order, and each transaction has a unique begin and end timestamp, such that TxBegin < TxEnd.Property 2: A given version is valid for the interval specified by the begin and end timestamps. There is a total order << of versions for a given datum, as determined by the timestamp order of the nonoverlapping version validity intervals.Property 3: The transaction Tx reads the latest committed version as of TxRead (where TxBegin <= TxRead < TxEnd) and validates (that is, repeats) the read of the latest committed version as of TxEnd. The transaction fails if the two reads return different versions.Property 4: Updates or deletes to a version V first check the visibility of V. Checking the visibility of V is equivalent to reading V. Therefore, a write is always preceded by a read: if transaction Tx writes Vnew, then transaction Tx has first read Vold, where Vold << Vnew. Moreover, there exists no version V such that Vold << V << Vnew, otherwise Tx would have never committed: it would have failed during the Active phase when changing the end timestamp of Vold (see Section 3.1, paragraph "Update version") 1 .1 Notice that all our concurrency control algorithms enforce a stronger property: they use the first-writer-wins rule to abort transactions that participate in a write-write conflict before it is determined whether the first writer will commit. The more relaxed property described here is sufficient to prove correctness.
Scientific experiments and large-scale simulations produce massive amounts of data. Many of these scientific datasets are arrays, and are stored in file formats such as HDF5 and NetCDF. Although scientific data management systems, such as SciDB, are designed to manipulate arrays, there are challenges in integrating these systems into existing analysis workflows. Major barriers include the expensive task of preparing and loading data before querying, and converting the final results to a format that is understood by the existing post-processing and visualization tools. As a consequence, integrating a data management system into an existing scientific data analysis workflow is time-consuming and requires extensive user involvement.In this paper, we present the design of a new scientific data analysis system that efficiently processes queries directly over data stored in the HDF5 file format. This design choice eliminates the tedious and error-prone data loading process, and makes the query results readily available to the next processing steps of the analysis workflow. Our design leverages the increasing main memory capacities found in supercomputers through bitmap indexing and in-memory query execution. In addition, query processing over the HDF5 data format can be effortlessly parallelized to utilize the ample concurrency available in large-scale supercomputers and modern parallel file systems. We evaluate the performance of our system on a large supercomputing system and experiment with both a synthetic dataset and a real cosmology observation dataset. Our system frequently outperforms the relational database system that the cosmology team currently uses, and is more than 10× faster than Hive when processing data in parallel. Overall, by eliminating the data loading step, our query processing system is more effective in supporting in situ scientific analysis workflows.
No abstract
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.