2019 15th International Conference on eScience (eScience) 2019
DOI: 10.1109/escience.2019.00093

Contextual Linking between Workflow Provenance and System Performance Logs

Abstract: When executing scientific workflows, anomalies of the workflow behavior are often caused by different issues such as resource failures at the underlying infrastructure. The provenance information collected by workflow management systems only captures the transformation of data at the workflow level. Analyzing provenance information and apposite system metrics requires expertise and manual effort. Moreover, it is often time-consuming to aggregate this information and correlate events occurring at different level…

Cited by 8 publications (6 citation statements)
References 9 publications
“…Metadata descriptions of research assets are not limited to 'characteristic' information; provenance data (which might be structured according to a standard such as PROV-O) for data products and processes are also an important target for semantic linking, especially for creating unified (or at least unifiable) records of how research assets are used and where they came from; such records may be generated from scientific workflow management systems with provenance support [35,42]. Such systems remain important for reproducible data science; most scientific investigations must follow a clear workflow, and there have been a number of workflow management systems developed with different characteristics and target applications [36], several of which have been applied to data science [37].…”
Section: Discussion (mentioning)
confidence: 99%
“…A workflow can be seen as a directed graph where each node is a function or module with input, output, and parameters, and each of the edges is a transfer of information [26] [27]. Elias et al [28] created a framework, cross-context workflow execution analyzer (CWEA), to connect provenance data with system logs and visualize the combination to aid scientists in detecting anomalies. Souza et al [29] used workflow provenance data to build a holistic view of the life cycle of scientific machine learning models.…”
Section: B Provenance (mentioning)
confidence: 99%
“…A Cross-context Workflow Execution Analyser (CWEA) is developed for users to effectively investigate possible workflow execution anomalies or bottlenecks by combining provenance with available system metrics [41]. The tool is able to retrieve available system logs of the particular machines (virtual machines if in Cloud) and align them with the provenance provided by the workflow management system.…”
Section: Provenance and System Logs (mentioning)
confidence: 99%