Data provenance (keeping track of who did what, where, when and how) boasts of various attractive use cases for distributed systems, such as intrusion detection, forensic analysis and secure information dependability. This potential, however, can only be realized if provenance is accessible by its primary stakeholders: the end-users. Existing provenance systems are designed in a 'all-or-nothing' fashion, making provenance inaccessible, difficult to extract and crucially, not controlled by its key stakeholders. To mitigate this, we propose that provenance be separated into system, data-specific and file-metadata provenance. Furthermore, we expand data-specific provenance as changes at a fine-grain level, or provenance-perchange, that is recorded alongside its source. We show that with the use of delta-encoding, provenance-per-change is viable, asserting our proposed architecture to be effectively realizable.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.