Prasad M. Deshpande scite author profile

Abstract. OLAP applications use precomputation of aggregate data to improve query response time. While this problem has been well-studied in the recent database literature, to our knowledge all previous work has focused on the special case in which all aggregates are computed from a single cube (in a star schema, this corresponds to there being a single fact table). This is unfortunate, because many real-world applications require aggregates over multiple fact tables. In this paper, we attempt to fill this gap by analyzing the issues that arise in multi-cube data models. We then examine performance by studying the precomputation problem for multi-cube systems. We show that this problem is significantly more complex than the single-cube precomputation problem, and that algorithms and cost models developed for single-cube precomputation must be extended to deal well with the multi-cube case. Our results from a prototype implementation show that for multi-cube workloads substantial performance improvements can be realized by using the multi-cube algorithms.

An array-based algorithm for simultaneous multidimensional aggregates

Zhao

1997

Computing multiple related group-bys and aggregates is one of the core operations of On-Line Analytical Processing (OLAP) applications. Recently, Gray et al. [GBLP95] proposed the "Cube" operator, which computes group-by aggregations over all possible subsets of the specified dimensions. The rapid acceptance of the importance of this operator has led to a variant of the Cube being proposed for the SQL standard. Several efficient algorithms for Relational OLAP (ROLAP) have been developed to compute the Cube. However, to our knowledge there is nothing in the literature on how to compute the Cube for Multidimensional OLAP (MOLAP) systems, which store their data in sparse arrays rather than in tables. In this paper, we present a MOLAP algorithm to compute the Cube, and compare it to a leading ROLAP algorithm. The comparison between the two is interesting, since although they are computing the same function, one is value-based (the ROLAP algorithm) whereas the other is position-based (the MOLAP algorithm). Our tests show that, given appropriate compression techniques, the MOLAP algorithm is significantly faster than the ROLAP algorithm. In fact, the difference is so pronounced that this MOLAP algorithm may be useful for ROLAP systems as well as MOLAP systems, since in many cases, instead of cubing a table directly, it is faster to first convert the table to an array, cube the array, then convert the result back to a table.
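To make the Cube operator's semantics concrete, the following is a minimal Python sketch that sums a measure over every subset of the given dimensions. It is a naive, value-based illustration only, not the array-based MOLAP algorithm from the paper; the function name `cube`, the row layout, and the sample data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def cube(rows, dims, measure):
    """Sum `measure` grouped by every subset of `dims`.

    Returns {subset_of_dims: {group_key: total}}. This rescans the rows
    once per subset, which is simple but far from efficient.
    """
    result = {}
    for k in range(len(dims) + 1):
        for subset in combinations(dims, k):
            totals = defaultdict(int)
            for row in rows:
                key = tuple(row[d] for d in subset)
                totals[key] += row[measure]
            result[subset] = dict(totals)
    return result


# Hypothetical sample data: two dimensions, one measure.
rows = [
    {"product": "pen", "store": "A", "sales": 3},
    {"product": "pen", "store": "B", "sales": 5},
    {"product": "ink", "store": "A", "sales": 2},
]
agg = cube(rows, ("product", "store"), "sales")
```

With two dimensions the result holds four group-bys: the grand total (empty subset), one per single dimension, and the full two-dimensional grouping.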

OLAP over uncertain and imprecise data

Burdick, Deshpande, Jayram et al. 2006

The VLDB Journal

119

Caching multidimensional queries using chunks

et al. 1998

Caching has been proposed (and implemented) by OLAP systems in order to reduce response times for multidimensional queries. Previous work on such caching has considered table level caching and query level caching. Table level caching is more suitable for static schemes. On the other hand, query level caching can be used in dynamic schemes, but is too coarse for “large” query results. Query level caching has the further drawback for small query results in that it is only effective when a new query is subsumed by a previously cached query. In this paper, we propose caching small regions of the multidimensional space called “chunks.” Chunk-based caching allows fine granularity caching, and allows queries to partially reuse the results of previous queries with which they overlap. To facilitate the computation of chunks required by a query but missing from the cache, we propose a new organization for relational tables, which we call a “chunked file.” Our experiments show that for workloads that exhibit query locality, chunked caching combined with the chunked file organization performs better than query level caching. An unexpected benefit of the chunked file organization is that, due to its multidimensional clustering properties, it can significantly improve the performance of queries that “miss” the cache entirely as compared to traditional file organizations.
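The idea of partial reuse through chunks can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: the chunk side length `CHUNK`, the `ChunkCache` class, and the `compute_chunk` callback (standing in for the backend that computes a missing chunk) are all hypothetical, and there is no eviction policy.

```python
from itertools import product

CHUNK = 4  # hypothetical chunk side length along every dimension


def chunks_for(query):
    """Map a query, given as half-open (lo, hi) ranges per dimension,
    to the coordinates of every chunk it overlaps."""
    ranges = [range(lo // CHUNK, (hi - 1) // CHUNK + 1) for lo, hi in query]
    return list(product(*ranges))


class ChunkCache:
    def __init__(self, compute_chunk):
        self.compute_chunk = compute_chunk  # called only for missing chunks
        self.store = {}
        self.misses = 0

    def get(self, query):
        """Return the chunks covering `query`, computing only those not cached."""
        out = {}
        for cid in chunks_for(query):
            if cid not in self.store:
                self.misses += 1
                self.store[cid] = self.compute_chunk(cid)
            out[cid] = self.store[cid]
        return out


cache = ChunkCache(lambda cid: f"data{cid}")
cache.get(((0, 8), (0, 4)))    # touches chunks (0, 0) and (1, 0)
first = cache.misses
cache.get(((4, 12), (0, 4)))   # overlaps on chunk (1, 0); only (2, 0) is new
```

The second query is not subsumed by the first, so query level caching would recompute it entirely, whereas here only one of its two chunks has to be computed.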

Indexing and matching trajectories under inconsistent sampling rates

et al. 2015
