As scientific instruments and computer simulations produce more and more data, the task of locating the essential information to gain insight becomes increasingly difficult. FastBit is an efficient software tool to address this challenge. In this article, we present a summary of the key techniques, namely bitmap compression, encoding and binning. The advances in these techniques have led to a search tool that can answer structured (SQL) queries orders of magnitude faster than popular database systems. To illustrate how FastBit is used in applications, we present three examples involving a high-energy physics experiment, a combustion simulation, and an accelerator simulation. In each case, FastBit significantly reduces the response time and enables interactive exploration on terabytes of data.
We present a practical and general-purpose approach to large and complex visual data analysis where visualization processing, rendering and subsequent human interpretation is constrained to the subset of data deemed interesting by the user. In many scientific data analysis applications, "interesting" data can be defined by compound Boolean range queries of the form (temperature > 1000) AND (70 < pressure < 90). As data sizes grow larger, a central challenge is to answer such queries as efficiently as possible. Prior work in the visualization community has focused on answering range queries for scalar fields within the context of accelerating the search phase of isosurface algorithms. In contrast, our work describes an approach that leverages state-of-the-art indexing technology from the scientific data management community called "bitmap indexing." Our implementation, which we call "DEX" (short for dextrous data explorer), uses bitmap indexing to efficiently answer multivariate, multidimensional data queries to provide input to a visualization pipeline. We present an analysis overview and benchmark results that show bitmap indexing offers significant storage and performance improvements when compared to previous approaches for accelerating the search phase of isosurface algorithms. More importantly, since bitmap indexing supports complex multidimensional, multivariate range queries, it is more generally applicable to scientific data visualization and analysis problems. In addition to benchmark performance and analysis, we apply DEX to a typical scientific visualization problem encountered in combustion simulation data analysis.
Abstract-With the computing industry trending towards multi-and many-core processors, we study how a standard visualization algorithm, ray-casting volume rendering, can benefit from a hybrid parallelism approach. Hybrid parallelism provides the best of both worlds: using distributed-memory parallelism across a large numbers of nodes increases available FLOPs and memory, while exploiting shared-memory parallelism among the cores within each node ensures that each node performs its portion of the larger calculation as efficiently as possible. We demonstrate results from weak and strong scaling studies, at levels of concurrency ranging up to 216,000, and with datasets as large as 12.2 trillion cells. The greatest benefit from hybrid parallelism lies in the communication portion of the algorithm, the dominant cost at higher levels of concurrency. We show that reducing the number of participants with a hybrid approach significantly improves performance.
Abstract-One potential solution to reduce the concentration of carbon dioxide in the atmosphere is the geologic storage of captured CO 2 in underground rock formations, also known as carbon sequestration. There is ongoing research to guarantee that this process is both efficient and safe. We describe tools that provide measurements of media porosity, and permeability estimates, including visualization of pore structures. Existing standard algorithms make limited use of geometric information in calculating permeability of complex microstructures. This quantity is important for the analysis of biomineralization, a subsurface process that can affect physical properties of porous media. This paper introduces geometric and topological descriptors that enhance the estimation of material permeability. Our analysis framework includes the processing of experimental data, segmentation, and feature extraction and making novel use of multiscale topological analysis to quantify maximum flow through porous networks. We illustrate our results using synchrotron-based X-ray computed microtomography of glass beads during biomineralization. We also benchmark the proposed algorithms using simulated data sets modeling jammed packed bead beds of a monodispersive material.
There are some important and motivating questions that drive the research for processing massive data sets, like will it be possible to use the simpler pure parallelism technique to process tomorrow's data? Can pure parallelism scale sufficiently to process massive data sets?To answer these questions, the researchers performed a series of experiments, originally published in IEEE Computer Graphics and Applications [2] and forming the basis of this report, that studied the scalability of pure parallelism in visualization software on massive data sets. These experiments utilized multiple visualization algorithms and were run on multiple architectures. There were two types of experiments performed. The first experiment examined performance at a massive scale: 16,000 or more cores and one trillion or more cells. The second experiment studied whether the approach can maintain a fixed amount of time to complete an operation when the data size is doubled and the amount of resources is doubled, also known as weak scalability. At the time of their original publication, these experiments represented the largest data set sizes ever published in visualization literature. Further, their findings still continue to contribute to the understanding of today's dominant processing paradigm (pure parallelism) on tomorrow's data, in the form of scaling characteristics and bottlenecks at high levels of concurrency and with very large data sets.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.