Proceedings of the Thirtieth ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems 2011
DOI: 10.1145/1989284.1989303

On provenance minimization

Abstract: Provenance information has proved very effective in capturing the computational process performed by queries, and has been used extensively as input to many advanced data management tools (e.g., view maintenance, trust assessment, or query answering in probabilistic databases). We study here the core of provenance information, namely the part of provenance that appears in the computation of every query equivalent to the given one. This provenance core is informative as it describes the part of th…

Cited by 10 publications (10 citation statements)
References 28 publications
“…For example, consider the interval encoding (Figure 9(b)) for the example network shown in Figure 9(a). The provenance of tuple 3:1 is represented as a single interval [1,4], because the TID-Set forms a single contiguous sequence of TIDs (1 to 4). Interval encoding is most advantageous for queries involving aggregations over long sequences of contiguous TIDs, but introduces overhead if such sequences do not occur: both the start and end TID of an interval need to be stored.…”
Section: Provenance Compression (mentioning)
confidence: 99%
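
The interval encoding described in this statement can be illustrated with a minimal sketch. It assumes TIDs are plain integers and that a TID-Set is an ordinary Python set; the function name `encode_intervals` is hypothetical and does not come from the citing paper.

```python
from typing import Iterable, List, Tuple


def encode_intervals(tids: Iterable[int]) -> List[Tuple[int, int]]:
    """Collapse a set of integer TIDs into sorted, closed (start, end) intervals.

    Hypothetical helper for illustration only; not the citing paper's code.
    """
    intervals: List[Tuple[int, int]] = []
    for tid in sorted(set(tids)):
        if intervals and tid == intervals[-1][1] + 1:
            # The TID continues the current contiguous run: extend its end.
            intervals[-1] = (intervals[-1][0], tid)
        else:
            # The TID is not adjacent to the previous one: open a new interval.
            intervals.append((tid, tid))
    return intervals


# A contiguous TID-Set (1 to 4) collapses into one interval.
contiguous = encode_intervals({1, 2, 3, 4})

# A scattered TID-Set yields one interval per TID, so two values
# (start and end) are stored for each original TID.
scattered = encode_intervals({1, 3, 5})
```

With a contiguous TID-Set, four TIDs are stored as two values; with a scattered one, three TIDs become six stored values, which is the overhead the statement refers to.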
“…Consider the covering intervals shown in Figure 9(b). The TID-Set for tuple 3:1 is covered by the interval [1,4]. This covering interval is stored in two additional fields at the end of the data tuple 3:1.…”
Section: Lazy Generation and Retrieval (mentioning)
confidence: 99%
“…We call such data K-annotated unordered XML, or simply K-UXML. Given a domain L of labels, the usual mutually recursive definition of XML data naturally generalizes to K-UXML:…”
Section: Annotated XML (mentioning)
confidence: 99%
“…In particular, Amsterdamer et al. [2] studied provenance minimization, and defined the core of provenance information, namely the part of provenance that appears independently of the query plan that is in use. They also provided algorithms that, given a query, compute an equivalent one that realizes the core provenance for all tuples in its result, as well as algorithms to compute the core provenance of a particular tuple in a query result without re-evaluating the query.…”
Section: Minimization and Factorization (mentioning)
confidence: 99%