2009
DOI: 10.1109/icde.2009.103

Differencing Provenance in Scientific Workflows

Abstract: Scientific workflow management systems are increasingly providing the ability to manage and query the provenance of data products. However, the problem of differencing the provenance of two data products produced by executions of the same specification has not been adequately addressed. Although this problem is NP-hard for general workflow specifications, an analysis of real scientific (and business) workflows shows that their specifications can be captured as series-parallel graphs overlaid with well…

Cited by 47 publications (35 citation statements)
References 20 publications
“…Our approach allows users to share their experiments in the spirit of [39], understand and compare their results [40] and possibly refine their analysis process to augment quality of their data sets. To do so, we have followed the recommendations and current standards on provenance [41] [19] and introduced a generator of IPython/Jupyter notebooks [2].…”
Section: Discussion (mentioning)
confidence: 99%
“…Answering these questions requires (inductive or recursive) querying along dependency paths of unknown length, making them computationally expensive [10], [7], [13] as they involve the transitive closure of data dependencies ddep. A particular challenge of our collaborative e-Science scenario is to ensure that the traces (e.g., T_A and T_B in Figs.…”
Section: B Queries On the Provenance Model (mentioning)
confidence: 99%
“…Our model is similar to the one in [7] in that they also use a graph homomorphism to define when a trace (called run there) T is valid w.r.t. a given workflow graph W.…”
Section: A Abstract Workflow and Provenance Model (mentioning)
confidence: 99%
“…Bao et al. [15] introduce an algorithm for differencing provenance (due to workflow execution). The difference or edit distance between a pair of valid runs of the same specification is defined as a minimum cost sequence of edit operations that transform one run to the other.…”
Section: Querying Provenance (mentioning)
confidence: 99%