Database indexes are a core technique to speed up data retrieval in any kind of data processing system. However, in the presence of schemas with many attributes it becomes infeasible to create indexes for all columns, as maintenance costs and space requirements are simply too high. In these situations, a much more promising approach is to adaptively index the data, i.e. the database gradually partitions (or cracks) those columns that are frequently used in selections. In doing so, the "indexedness" of a table adapts to the requirements of the workload. A large body of work has investigated database cracking, which is a subset of adaptive indexing. Irrespective of their algorithmic behavior, essentially all these works have in common that the proposed methods use a simple two-sided in-place cracking kernel at the core, which performs a partitioning step. As this partitioning makes up a large portion of the total indexing effort, the choice of the kernel can make a factor of two difference in the running time for a method sitting on top. To approach the topic, we first perform an experimental evaluation of existing state-of-the-art kernels and study their respective downsides in detail. Based on the gained insights, we propose both an advanced version of the best existing kernel as well as a new and unconventional approach, which utilizes features of the operating system as well as data parallelism. In our final evaluation of all kernels, we vary entry size, index layout, selectivity, and number of threads, and provide a decision tree to select the best cracking kernel for the respective situation.
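The two-sided in-place partitioning step mentioned in this abstract can be sketched as follows. This is a minimal illustration of the general technique, not any of the kernels evaluated in the paper; the function name `crack_in_two` and the sample column are assumptions made for the example.

```python
def crack_in_two(column, lo, hi, pivot):
    """Partition column[lo:hi] in place around pivot (hypothetical helper).

    After the call, every value in column[lo:p] is < pivot and every value
    in column[p:hi] is >= pivot, where p is the returned split position.
    """
    left, right = lo, hi - 1
    while left <= right:
        if column[left] < pivot:
            left += 1            # already on the correct (left) side
        elif column[right] >= pivot:
            right -= 1           # already on the correct (right) side
        else:
            # Both cursors point at misplaced values: swap them.
            column[left], column[right] = column[right], column[left]
            left += 1
            right -= 1
    return left


# Made-up sample data: a selection such as "value < 10" cracks the column at 10.
column = [13, 4, 55, 9, 21, 2, 34, 8]
split = crack_in_two(column, 0, len(column), 10)
```

A later query with a different bound would only need to partition the one piece that contains the new bound, which is how the index refines itself gradually under the workload.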
Nowadays, SIMD hardware is omnipresent in computers. Nonetheless, many software projects hardly make use of SIMD instructions: Applications are usually written in general-purpose languages like C++. However, general-purpose languages only provide poor abstractions for SIMD programming, enforcing an error-prone, assembly-like programming style. Data-parallel languages are an alternative. They indeed offer more convenience to target SIMD architectures but introduce their own set of problems. In particular, programmers are often unwilling to port their working C++ code to a new programming language. In this paper we present Sierra: a SIMD extension for C++. It combines the full power of C++ with an intuitive and effective way to address SIMD hardware. With Sierra, the programmer can write efficient, portable and maintainable code. It is particularly easy to enhance existing code to run efficiently on SIMD machines. In contrast to prior approaches, the programmer has explicit control over the involved vector lengths.
Join order optimization is one of the most fundamental problems in processing queries on relational data. It has been studied extensively for almost four decades now. Still, because of its NP-hardness, no generally efficient solution exists and the problem remains an important topic of research. The scope of algorithms to compute join orders ranges from exhaustive enumeration, to combinatorics based on graph properties, to greedy search, to genetic algorithms, to recently investigated machine learning. A few works exist that use heuristic search to compute join orders. However, a theoretical argument why and how heuristic search is applicable to join order optimization is lacking. In this work, we investigate join order optimization via heuristic search. In particular, we provide a strong theoretical framework, in which we reduce join order optimization to the shortest path problem. We then thoroughly analyze the properties of this problem and the applicability of heuristic search. We devise crucial optimizations to make heuristic search tractable. We implement join ordering via heuristic search in a real DBMS and conduct an extensive empirical study. Our findings show that for star- and clique-shaped queries, heuristic search finds optimal plans an order of magnitude faster than the current state of the art. Our suboptimal solutions further extend the cost/time Pareto frontier.
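One way to picture a reduction of join ordering to a shortest path problem is sketched below. The abstract does not spell out the paper's actual reduction, so everything here is an assumption made for illustration: states are sets of already joined relations, an edge adds one relation (left-deep plans only), the edge weight is the size of the resulting intermediate result, and the statistics in `CARD` and `SEL` are invented. The search is plain uniform-cost search, i.e. A* with a zero heuristic.

```python
import heapq
from fractions import Fraction
from itertools import count
from math import prod

# Hypothetical statistics, invented for this example.
CARD = {"A": 1000, "B": 100, "C": 10}                # base table cardinalities
SEL = {frozenset("AB"): Fraction(1, 100),            # join predicate selectivities
       frozenset("BC"): Fraction(1, 10)}


def cardinality(rels):
    """Estimated result size of joining all relations in rels."""
    size = prod(CARD[r] for r in rels)
    for pair, sel in SEL.items():
        if pair <= rels:                             # predicate applies within rels
            size *= sel
    return size


def best_join_order(relations):
    """Uniform-cost search from the empty set to the set of all relations."""
    goal = frozenset(relations)
    tie = count()                                    # keeps heap entries comparable
    frontier = [(0, next(tie), frozenset(), ())]
    settled = set()
    while frontier:
        cost, _, joined, order = heapq.heappop(frontier)
        if joined == goal:
            return cost, order
        if joined in settled:
            continue
        settled.add(joined)
        for rel in sorted(goal - joined):
            nxt = joined | {rel}
            # Edge weight: size of the new intermediate result.
            # Picking the first base relation is treated as free.
            step = cardinality(nxt) if joined else 0
            heapq.heappush(frontier, (cost + step, next(tie), nxt, order + (rel,)))


cost, order = best_join_order(["A", "B", "C"])
```

With these made-up numbers, the cheapest path joins B and C first and adds A last, for a total cost of 1100. An informed heuristic that never overestimates the remaining cost would turn this into A* and prune parts of the state space without losing optimality.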
Doing sports on a regular basis is beneficial for personal health and well-being. This paper introduces the concept of a mobile app, called OmniSports, which aims to assist people already keen on doing sports. It will provide a digital training schedule, which is updated automatically not only in instrumented fitness centers, but also during outdoor exercises that are done as part of digital fitness trails. Within this paper, we present the results of two initial studies: an analysis of current training schedules and an investigation of people's ability to create and use outdoor spots for exercises.
Interpreted execution of queries, as in the vectorized model, suffers from interpretation overheads. By compiling queries this interpretation overhead is eliminated at the cost of a compilation phase that delays execution, sacrificing latency for throughput. For short-lived queries, minimizing latency is important, while for long-running queries throughput outweighs latency. Because neither a purely interpretive model nor a purely compiling model can provide low latency and high throughput, adaptive solutions emerged. Adaptive systems seamlessly transition from interpreted to compiled execution, achieving low latency for short-lived queries and high throughput for long-running queries. However, these adaptive systems pose an immense development effort and require expert knowledge in both interpreter and compiler design. In this work, we investigate query execution by compilation to WebAssembly. We are able to compile even complex queries in less than a millisecond to machine code with near-optimal performance. By delegating execution of WebAssembly to the V8 engine, we are able to seamlessly transition from rapidly compiled yet non-optimized code to thoroughly optimized code during execution. Our approach provides both low latency and high throughput, is adaptive out of the box, and is straightforward to implement. The drastically reduced compilation times even enable us to explore generative programming of library code that is fully inlined by construction. Our experimental evaluation confirms that our approach yields competitive and sometimes superior performance.
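The contrast between interpreting and compiling a query can be illustrated with a toy filter predicate. This sketch does not involve WebAssembly or V8; it uses Python's built-in `compile` and `eval` purely as a stand-in code generation target, and the tuple-based expression format and all function names are assumptions made for the example.

```python
# A predicate is a nested tuple: ("col", name), ("const", value),
# or (operator, left, right) with operator in {"<", ">", "and"}.

def interpret(expr, row):
    """Tree-walking interpreter: dispatches on the node type for every row."""
    op = expr[0]
    if op == "col":
        return row[expr[1]]
    if op == "const":
        return expr[1]
    lhs, rhs = interpret(expr[1], row), interpret(expr[2], row)
    if op == "<":
        return lhs < rhs
    if op == ">":
        return lhs > rhs
    if op == "and":
        return lhs and rhs
    raise ValueError(f"unknown operator: {op}")


def generate(expr):
    """Translate the expression tree into Python source text, once per query."""
    op = expr[0]
    if op == "col":
        return f"row[{expr[1]!r}]"
    if op == "const":
        return repr(expr[1])
    return f"({generate(expr[1])} {op} {generate(expr[2])})"


def compile_predicate(expr):
    """Pay a one-time compilation cost, then run without tree dispatch."""
    source = f"lambda row: {generate(expr)}"
    return eval(compile(source, "<query>", "eval"))


# Made-up query and data: price < 20 and qty > 1
predicate = ("and", ("<", ("col", "price"), ("const", 20)),
                    (">", ("col", "qty"), ("const", 1)))
rows = [{"price": 10, "qty": 3}, {"price": 25, "qty": 5}, {"price": 5, "qty": 1}]

interpreted = [r for r in rows if interpret(predicate, r)]
compiled_fn = compile_predicate(predicate)
compiled = [r for r in rows if compiled_fn(r)]
```

Both paths select the same rows. The interpreter starts immediately but repeats its dispatch for every row, while the compiled predicate delays the first result and then runs faster per row, which is the latency versus throughput trade-off the abstract describes.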