Data abstraction techniques are widely used in multiresolution visualization systems to reduce visual clutter and facilitate analysis from overview to detail. However, analysts are usually unaware of how well the abstracted data represent the original dataset, which can impact the reliability of results gleaned from the abstractions. In this paper, we define two data abstraction quality measures for computing the degree to which the abstraction conveys the original dataset: the Histogram Difference Measure and the Nearest Neighbor Measure. They have been integrated within XmdvTool, a public-domain multiresolution visualization system for multivariate data analysis that supports sampling as well as clustering to simplify data. Several interactive operations are provided, including adjusting the data abstraction level, changing selected regions, and setting the acceptable data abstraction quality level. By performing these operations, analysts can select an optimal data abstraction level. Analysts can also use the measures to compare different abstraction methods, see how well relative data density and outliers are maintained, and then select an abstraction method that meets the requirements of their analytic tasks.
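The abstract names the two measures but does not give their formulas, so the following is only an illustrative sketch of how measures of this kind might be computed, not the paper's definitions. It assumes the histogram-based score is one minus the total variation distance between normalized histograms, and the nearest-neighbor score is one minus the mean distance from each original point to its closest abstracted point, scaled by the bounding-box diagonal. The function names `histogram_difference_quality` and `nearest_neighbor_quality` are hypothetical and are not part of XmdvTool.

```python
import math


def _histogram(values, lo, hi, bins):
    """Normalized bin frequencies of `values` over the range [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0  # avoid a zero width when lo == hi
    for v in values:
        idx = int((v - lo) / width)
        counts[max(0, min(idx, bins - 1))] += 1  # clamp edge values
    return [c / len(values) for c in counts]


def histogram_difference_quality(original, abstracted, bins=10):
    """Assumed score in [0, 1]; 1.0 means identical binned distributions."""
    if not original or not abstracted:
        raise ValueError("both datasets must be non-empty")
    lo, hi = min(original), max(original)
    h_orig = _histogram(original, lo, hi, bins)
    h_abs = _histogram(abstracted, lo, hi, bins)
    # Total variation distance between the two normalized histograms.
    distance = 0.5 * sum(abs(a - b) for a, b in zip(h_orig, h_abs))
    return 1.0 - distance


def nearest_neighbor_quality(original, abstracted):
    """Assumed score in [0, 1] for lists of equal-length numeric tuples."""
    if not original or not abstracted:
        raise ValueError("both datasets must be non-empty")
    dims = len(original[0])
    lows = [min(p[d] for p in original) for d in range(dims)]
    highs = [max(p[d] for p in original) for d in range(dims)]
    diagonal = math.dist(lows, highs) or 1.0  # normalizing constant
    total = sum(min(math.dist(p, q) for q in abstracted) for p in original)
    return 1.0 - total / (len(original) * diagonal)


if __name__ == "__main__":
    data = list(range(100))
    print(histogram_difference_quality(data, data[::2]))  # evenly spread sample
    print(histogram_difference_quality(data, data[:50]))  # sample that drops the upper half
```

Under these assumptions, an evenly spread sample scores higher on the histogram measure than a sample that drops half of the value range, which mirrors the kind of comparison between abstraction methods that the abstract describes.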
The scatterplot matrix is one of the most common methods used to project multivariate data onto two dimensions for display. While each off-diagonal plot maps a pair of non-identical dimensions, there is no prescribed mapping for the diagonal plots. In this paper, histograms, 1D plots, and 2D plots are drawn in the diagonal plots of the scatterplot matrix. In 1D plots, the data are assumed to have an order, and they are projected in this order. In 2D plots, the data are assumed to have spatial information, and they are projected onto locations based on these spatial attributes, with color representing the dimension value. The plots and the scatterplots are linked together by brushing: brushing on these alternate visualizations affects the selected data in the regular scatterplots, and vice versa. Users can also navigate to other visualizations, such as parallel coordinates and glyphs, which are likewise linked with the scatterplot matrix by brushing. Ordering and spatial attributes can also be used as methods of indexing and organizing data. Users can select an ordering span or a spatial region by interacting with 1D plots or 2D plots, and then observe the characteristics of the selected data subset. 1D plots and 2D plots provide the ability to explore the ordering and spatial attributes, while the other views are for viewing the abstract data. In a sense, we are linking what are traditionally seen as scientific visualization methods with methods from the information visualization and statistical graphics fields. We validate the usefulness of this integration with two case studies: time series data analysis and spatial data analysis.
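The linked brushing described above can be pictured as a single shared set of selected record indices that every view both reads and writes. The sketch below is a minimal illustration under that assumption; the class `LinkedSelection`, the functions `brush_order_span` and `brush_rectangle`, and the sample field names are all hypothetical and do not reflect the actual implementation in the paper.

```python
from typing import Callable, Dict, List, Set, Tuple

Record = Dict[str, float]


class LinkedSelection:
    """Shared set of selected record indices observed by every view."""

    def __init__(self) -> None:
        self.selected: Set[int] = set()
        self._listeners: List[Callable[[Set[int]], None]] = []

    def subscribe(self, listener: Callable[[Set[int]], None]) -> None:
        """Register a view callback that is called on every selection change."""
        self._listeners.append(listener)

    def update(self, indices: Set[int]) -> None:
        """Replace the selection and notify all subscribed views."""
        self.selected = set(indices)
        for listener in self._listeners:
            listener(self.selected)


def brush_order_span(records: List[Record], order_dim: str,
                     start: float, end: float) -> Set[int]:
    """Brush on a 1D plot: select records whose ordering value is in [start, end]."""
    return {i for i, r in enumerate(records) if start <= r[order_dim] <= end}


def brush_rectangle(records: List[Record], x_dim: str, y_dim: str,
                    x_range: Tuple[float, float],
                    y_range: Tuple[float, float]) -> Set[int]:
    """Brush on a scatterplot: select records inside an axis-aligned rectangle."""
    return {
        i for i, r in enumerate(records)
        if x_range[0] <= r[x_dim] <= x_range[1]
        and y_range[0] <= r[y_dim] <= y_range[1]
    }


if __name__ == "__main__":
    # Hypothetical time series records; "t" is the ordering attribute.
    records = [
        {"t": 0, "temp": 10.0, "hum": 0.3},
        {"t": 1, "temp": 12.0, "hum": 0.5},
        {"t": 2, "temp": 15.0, "hum": 0.4},
        {"t": 3, "temp": 11.0, "hum": 0.9},
    ]
    selection = LinkedSelection()
    selection.subscribe(lambda s: print("scatterplot highlights", sorted(s)))
    selection.subscribe(lambda s: print("1D plot highlights", sorted(s)))
    # A brush in either view writes to the same shared selection.
    selection.update(brush_order_span(records, "t", 1, 2))
    selection.update(brush_rectangle(records, "temp", "hum", (10, 12), (0.2, 0.6)))
```

Because both brush functions return plain index sets, any additional view (for example parallel coordinates) would only need to subscribe to the same selection object to stay linked.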
In this work, we describe our approach for making the interactive data exploration system XmdvTool quality-aware, so that users can make informed decisions. XmdvTool Q makes quality, or the lack thereof, explicit at every stage of the data exploration process, from raw data to abstracted data to the final visual displays, allowing users to query and navigate through data-, structure-, and quality-spaces.