JavaScript Object Notation (JSON) and its variants have gained great popularity in recent years. Unfortunately, analytics over JSON data is often bottlenecked by expensive JSON parsing. To address this, recent work has shown that building bitwise indices on JSON data, called structural indices, can greatly accelerate querying. Despite its promise, existing structural index construction does not scale well as records become larger and more complex, due to its inherently sequential construction process and costly memory copies that grow with the nesting level. To address these issues, this work introduces Pison, a more memory-efficient structural index constructor that supports intra-record parallelism. First, Pison features a redesign of the bottleneck step in the existing solution; the new design is not only simpler but also more memory-efficient. More importantly, Pison is able to build structural indices for a single bulky record in parallel, enabled by a group of customized parallelization techniques. Finally, Pison is also optimized for better data locality, which is especially critical when processing bulky records. Our evaluation using real-world JSON datasets shows that Pison achieves 9.8X speedup (on average) over the existing structural index construction solution for bulky records, and 4.6X end-to-end speedup (on average, indexing plus querying) over a state-of-the-art SIMD-based JSON parser on a 16-core machine.
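To make the idea of a structural index concrete, the sketch below (an illustration of the general technique, not Pison's actual word-level SIMD implementation) builds a simple bitmap over a JSON record that marks the positions of structural characters ({, }, [, ], :, ,) falling outside string literals. A query engine can then jump between marked positions instead of re-parsing every byte.

```python
def build_structural_bitmap(record):
    """Return a 0/1 list over the record's bytes; 1 marks a structural
    character ({, }, [, ], :, ,) that lies outside any string literal."""
    bitmap = [0] * len(record)
    in_string = False   # are we inside a "..." string literal?
    escaped = False     # was the previous character a backslash?
    for i, ch in enumerate(record):
        if escaped:
            escaped = False
            continue
        if ch == "\\":
            escaped = True
        elif ch == '"':
            in_string = not in_string
        elif not in_string and ch in "{}[]:,":
            bitmap[i] = 1
    return bitmap

record = '{"name": "a:b", "vals": [1, 2]}'
bitmap = build_structural_bitmap(record)
# The colon inside the string "a:b" is NOT marked; the structural ones are.
positions = [i for i, b in enumerate(bitmap) if b]
```

In the real systems, such bitmaps are built 64 bits at a time with SIMD and bitwise tricks; the sequential string-tracking pass shown here is exactly the step that is hard to parallelize across a single bulky record.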
Finite state machines (FSMs) are the backbone of many applications, but are difficult to parallelize due to their inherent dependencies. Speculative FSM parallelization has shown promise on multicore machines with up to eight cores. However, as hardware parallelism grows (e.g., Xeon Phi has up to 288 logical cores), a fundamental question arises: How does speculative FSM parallelization scale as the number of cores increases? Without answering this question, existing methods for speculative FSM parallelization simply use all available cores, which may not only waste computing resources but also result in suboptimal performance. In this work, we conduct a systematic scalability analysis for speculative FSM parallelization. Unlike many other parallelizations, which can be modeled by the classic Amdahl's law or its simple extensions, speculative FSM parallelization scales unconventionally due to the non-deterministic nature of speculation and the cost variations of misspeculation. To address these challenges, this work introduces a spectrum of scalability models that are customized to the properties of specific FSMs and the underlying architecture. The models, for the first time, precisely capture the scalability of speculative parallelization for different FSM computations, and clearly show the existence of a "sweet spot" in the number of cores employed by speculative FSM parallelization to achieve optimal performance. To make the scalability models practical, we develop S3, a scalability-sensitive speculation framework for FSM parallelization. For any given FSM, S3 can automatically characterize its properties and analyze its scalability, thereby guiding speculative parallelization towards optimal performance and more efficient use of computing resources. Evaluations on different FSMs and architectures confirm the accuracy of the proposed models and show that S3 achieves significant speedup (up to 5X) and energy savings (up to 77%).
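The core mechanism the abstract refers to can be sketched as follows. This is an illustrative toy, not S3 itself: the input is split into chunks, every chunk after the first speculates a start state, and a validation pass re-executes any chunk whose speculated start state disagrees with its predecessor's actual end state. The returned re-execution count is the misspeculation cost whose variation makes scalability hard to model. The speculative phase runs sequentially here for clarity; in practice each chunk would run on its own core.

```python
def run_fsm(trans, state, chunk):
    """Run the FSM given as a (state, symbol) -> state dict over a chunk."""
    for sym in chunk:
        state = trans[(state, sym)]
    return state

def speculative_run(trans, start, inp, n_chunks, guess):
    """Speculatively process inp in n_chunks pieces, guessing that every
    chunk (after the first) begins in state `guess`. Returns the final
    state and the number of chunks that had to be re-executed."""
    size = (len(inp) + n_chunks - 1) // n_chunks
    chunks = [inp[i:i + size] for i in range(0, len(inp), size)]
    # Speculative phase: chunk 0 uses the true start state, others the guess.
    results = [run_fsm(trans, start if i == 0 else guess, c)
               for i, c in enumerate(chunks)]
    # Validation phase: re-execute any chunk whose guess was wrong.
    reexecs = 0
    state = results[0]
    for i in range(1, len(chunks)):
        if state == guess:
            state = results[i]          # speculation succeeded
        else:
            state = run_fsm(trans, state, chunks[i])  # misspeculation
            reexecs += 1
    return state, reexecs

# Toy FSM: track the parity of 'a's seen (state 0 = even, 1 = odd).
trans = {(0, 'a'): 1, (0, 'b'): 0, (1, 'a'): 0, (1, 'b'): 1}
final, reexecs = speculative_run(trans, 0, "aababba", 3, 0)
```

Whether the guess succeeds depends on how quickly the FSM "forgets" its start state on real inputs, which is exactly the FSM-specific property that a scalability model must capture.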
CCS CONCEPTS • Computing methodologies → Parallel algorithms; • Computer systems organization → Parallel architectures;
We consider our paper's artifact to be the benchmarks used in the paper, as well as the results obtained by running BoostFSM to enable scalable FSM parallelization. We have provided a zip file containing a simplified version of our implementation for download and evaluation; however, performance measurement requires a KNL architecture with 64 cores, so reviewers are also encouraged to contact us for remote access. This artifact reproduces only part of the results shown in the paper (to keep the total evaluation time under 4 hours, and because our framework evolves over time), which we hope suffices to validate the claims made in the paper. For any bugs, comments, or feedback, please do not hesitate to contact us.