Rewriting queries with arbitrary aggregation functions using views

Cohen, Sara; Nutt, Werner; Sagiv, Yehoshua

doi:10.1145/1138394.1138400

Cited by 34 publications

(27 citation statements)

References 30 publications

“…The probability of a query match can be computed in polynomial time for TP. Note that for relational databases, computational problems for aggregate queries are usually more difficult than for non-aggregate queries (Cohen et al, 2006). This situation also holds for PrXML System Issues.…”

Section: Example 6 the Evaluation Of The Query Q Countd (mentioning)

confidence: 99%

Modeling, Querying, and Mining Uncertain XML Data

Kharlamov

Senellart

2013

Data Mining

This chapter deals with data mining in uncertain XML data models, this uncertainty typically coming from imprecise automatic processes. We first review the literature on modeling uncertain data, starting with well-studied relational models and moving then to their semistructured counterparts. We focus on a specific probabilistic XML model, that allows representing arbitrary finite distributions of XML documents, and has been extended to also allow continuous distributions of data values. We summarize previous work on querying this uncertain data model and show how to apply the corresponding techniques to several data mining tasks, exemplified through use cases on two running examples.

“…The problem of reformulating queries using views for query optimization or database design dates back to equivalent conjunctive queries in relational data model [17,5], and were expanded to encompass recursive queries [13] or even queries with aggregation functions [10] and later revisited in semi-structural data model with regular path queries [16]. In addition to theoretical interests on complexity [2], minimization [8] of reformulation of queries and other constraints [19] in the presence of materialized views, practical and scalable solutions to exploiting views are intensively studied, including [6].…”

Section: Related Work (mentioning)

confidence: 99%

RDF pattern matching using sortable views

Chong

Chen

Zhang

et al. 2012

Proceedings of the 21st ACM International Conference on Information and Knowledge Management

In the last few years, RDF is becoming the dominating data model used in semantic web for knowledge representation and inference. In this paper, we revisit the problem of pattern matching query in RDF model, which is usually expensive in efficiency due to the huge cost on join operations. To alleviate the efficiency pain, view materialization techniques are usually deployed to accelerate the query processing. However, given an arbitrary view, it remains difficult to identify how to reuse the view for a particular query, because of the NP-hardness behind the algorithm matching patterns and views. To fully exploit the benefit of the materialized views, we propose a new paradigm to enhance the effectiveness of the materialized view. Instead of choosing materialized views in arbitrary form, our paradigm aims to select the views only if they are sortable. The property of sortability raises huge gains on the pattern-view matching, bringing down the cost to linear complexity in terms of the pattern size. On the other side, the costs on identifying sortable views and searching over the views using inverted index are affordable. Moreover, sortable views generally improve the overall performance of pattern matching, by means of a cost model used to optimize the query rewriting on the most appropriate views. Finally, we demonstrate extensive experimental results to verify the superiority of our proposal on both efficiency and effectiveness.

“…We have compared and contrasted our results with classic minimization for all of these classes, except that we restricted the discussion to queries with disequalities (≠) instead of general inequalities (<, ≤, ...). Query inclusion and minimization were further studied for queries of various classes, for instance, aggregation, bag semantics and arithmetic comparisons [13,14,2,8]; identifying the core provenance for queries of these classes, and for queries with general inequalities (<, ≤, ...), is an interesting future work. Practically efficient heuristics (e.g.…”

Section: Related Work (mentioning)

confidence: 99%

On provenance minimization

Amsterdamer

Deutch

Milo

et al. 2011

Proceedings of the Thirtieth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems

Provenance information has been proved to be very effective in capturing the computational process performed by queries, and has been used extensively as the input to many advanced data management tools (e.g. view maintenance, trust assessment, or query answering in probabilistic databases). We study here the core of provenance information, namely the part of provenance that appears in the computation of every query equivalent to the given one. This provenance core is informative as it describes the part of the computational process that is inherent to the query. It is also useful as a compact input to the above mentioned data management tools. We study algorithms that, given a query, compute an equivalent query that realizes the core provenance for all tuples in its result. We study these algorithms for queries of varying expressive power. Finally, we observe that, in general, one would not want to require database systems to evaluate a specific query that realizes the core provenance, but instead to be able to find, possibly off-line, the core provenance of a given tuple in the output (computed by an arbitrary equivalent query), without rewriting the query. We provide algorithms for such direct computation of the core provenance.
