Provenance for transactional updates is critical for many applications such
as auditing and debugging of transactions. Recently, we have introduced
MV-semirings, an extension of the semiring provenance model that supports
updates and transactions. Furthermore, we have proposed reenactment, a
declarative form of replay with provenance capture, as an efficient and
non-invasive method for computing this type of provenance. However, this
approach is limited to the snapshot isolation (SI) concurrency control protocol
while many real-world applications apply the read committed version of snapshot
isolation (RC-SI) to improve performance at the cost of consistency. We present
non-trivial extensions of the model and reenactment approach to be able to
compute provenance of RC-SI transactions efficiently. In addition, we develop
techniques for applying reenactment across multiple RC-SI transactions. Our
experiments demonstrate that our implementation in the GProM system supports
efficient reconstruction and querying of provenance.
Comment: long versions of CIKM paper
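The difference between SI and RC-SI that motivates these extensions can be illustrated with a minimal sketch. It assumes the usual reading of the two protocols: under SI every statement reads the snapshot as of transaction start, while under RC-SI each statement reads the data committed as of its own start. The names `Version`, `visible`, and `read_ts_for_statement` are hypothetical, timestamps are simple logical integers, and the transaction's own writes are ignored for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Version:
    key: str
    value: int
    commit_ts: int  # logical time at which this version was committed


def visible(versions: List[Version], key: str, read_ts: int) -> int:
    """Return the latest value of `key` committed at or before `read_ts`."""
    candidates = [v for v in versions if v.key == key and v.commit_ts <= read_ts]
    return max(candidates, key=lambda v: v.commit_ts).value


def read_ts_for_statement(isolation: str, txn_start_ts: int, stmt_start_ts: int) -> int:
    """Pick the snapshot a statement reads from (assumed semantics, see lead-in)."""
    if isolation == "SI":
        return txn_start_ts   # one snapshot for the whole transaction
    if isolation == "RC-SI":
        return stmt_start_ts  # a fresh snapshot per statement
    raise ValueError(f"unknown isolation level: {isolation}")


# A concurrent transaction commits a new balance at time 5.
versions = [Version("balance", 100, 1), Version("balance", 70, 5)]

# Our transaction starts at time 3; its second statement starts at time 6.
si_value = visible(versions, "balance", read_ts_for_statement("SI", 3, 6))
rc_value = visible(versions, "balance", read_ts_for_statement("RC-SI", 3, 6))
```

In this sketch the SI statement still sees the balance committed at time 1, whereas the RC-SI statement sees the concurrent commit at time 5. A replay of an RC-SI transaction therefore has to supply a different database state to each statement.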
Debugging transactions and understanding their execution are of immense importance for developing OLAP applications, tracing the causes of errors in production systems, and auditing the operations of a database. However, debugging transactions is hard for several reasons: 1) after a transaction has executed, its input is no longer available for debugging, 2) the internal states of a transaction are typically not accessible, and 3) the execution of a transaction may be affected by concurrently running transactions. We present a debugger for transactions that enables non-invasive, postmortem debugging with provenance tracking and supports what-if scenarios (changes to transaction code or data). Using reenactment, a declarative replay technique we have developed, a transaction is replayed over the state of the database seen by its original execution, including all of its interactions with concurrently executed transactions from the history. Importantly, our approach uses the temporal database and audit logging capabilities available in many DBMSs and requires no modifications to the underlying database system or to the transactional workload.
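The idea of replaying a transaction over the state it originally saw can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes the past snapshot has already been fetched (for example through a DBMS time-travel query), models each update as a hypothetical `(name, condition, change)` triple, and records for every output row the input row and the names of the updates that modified it.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, int]
# Hypothetical update model: (name, WHERE condition, SET function).
Update = Tuple[str, Callable[[Row], bool], Callable[[Row], Row]]


def reenact(snapshot: List[Row], updates: List[Update]) -> List[Tuple[Row, dict]]:
    """Replay `updates` over a past snapshot without mutating it.

    Returns (row, provenance) pairs, where provenance holds the input row
    and the names of the updates that modified it, in execution order.
    """
    results = []
    for original in snapshot:
        current = dict(original)  # work on a copy; the snapshot stays intact
        applied = []
        for name, condition, change in updates:
            if condition(current):
                current = change(dict(current))
                applied.append(name)
        results.append((current, {"input": original, "updates": applied}))
    return results


# State of the table as seen by the transaction (assumed to come from time travel).
snapshot = [{"id": 1, "balance": 60}, {"id": 2, "balance": 200}]

# The transaction's statements (assumed to come from an audit log).
updates = [
    ("u1", lambda r: r["id"] == 1, lambda r: {**r, "balance": r["balance"] - 20}),
    ("u2", lambda r: r["balance"] < 50, lambda r: {**r, "balance": r["balance"] - 5}),
]

result = reenact(snapshot, updates)
```

Because the replay is a pure function of the snapshot and the logged statements, it leaves the stored data untouched, and a what-if scenario amounts to calling `reenact` with an edited `updates` list or an edited `snapshot`.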
We introduce historical what-if queries, a novel type of what-if analysis that determines the effect of a hypothetical change to the transactional history of a database. For example, "how would revenue be affected if we had charged an additional $6 for shipping?" Such queries may lead to more actionable insights than traditional what-if queries because their results can be used to inform future actions, e.g., increasing shipping fees. We develop efficient techniques for answering historical what-if queries, i.e., determining how a modified history affects the current database state. Our techniques are based on reenactment, a replay technique for transactional histories. We optimize this process using program and data slicing techniques that determine which updates and what data can be excluded from reenactment without affecting the result. Using an implementation of our techniques in Mahif (a Middleware for Answering Historical what-IF queries), we demonstrate their effectiveness experimentally.
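A historical what-if query of the kind described above can be sketched as a comparison of two replays from the same starting state. The sketch below is an illustrative assumption rather than Mahif's actual interface: `replay`, `what_if`, and `revenue` are hypothetical names, a history is a list of `(condition, change)` pairs, and the program and data slicing optimizations are omitted, so both histories are replayed in full.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, int]
Update = Tuple[Callable[[Row], bool], Callable[[Row], Row]]


def replay(state: List[Row], history: List[Update]) -> List[Row]:
    """Apply every update of `history` in order to a copy of `state`."""
    rows = [dict(r) for r in state]
    for condition, change in history:
        rows = [change(dict(r)) if condition(r) else r for r in rows]
    return rows


def what_if(state: List[Row], history: List[Update],
            modified_history: List[Update],
            metric: Callable[[List[Row]], int]) -> int:
    """Difference in `metric` between the modified and the original history."""
    return metric(replay(state, modified_history)) - metric(replay(state, history))


def revenue(rows: List[Row]) -> int:
    return sum(r["price"] + r["fee"] for r in rows)


orders = [{"id": 1, "price": 30, "fee": 0}, {"id": 2, "price": 80, "fee": 0}]

# Original history: orders under 50 pay a shipping fee of 5.
history = [(lambda r: r["price"] < 50, lambda r: {**r, "fee": 5})]
# Hypothetical history: the same orders pay an additional 6 for shipping.
modified = [(lambda r: r["price"] < 50, lambda r: {**r, "fee": 11})]

delta = what_if(orders, history, modified, revenue)
```

Only order 1 is touched by either history, which hints at why slicing helps: rows and updates that cannot contribute to the difference need not be replayed at all.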
CCS CONCEPTS
• Information systems → Data provenance.